C stdlib isn't threadsafe and even safe Rust didn't save us

(edgedb.com)

327 points · msully4321 · 1y ago · 362 comments

167 comments · 28 top-level

mmastrac · 1y ago · 33 in thread

The major takeaway from this is that Rust will be making environment setters unsafe in the next edition. With luck, this will filter down into crates that trigger these crashes (https://github.com/alexcrichton/openssl-probe/issues/30 filed upstream in the meantime).

usefulcat · 1y ago

But that won't actually fix the underlying problem, namely that getenv and setenv (or unsetenv, probably) cannot safely be called from different threads.

It seems like the only reliable way to fix this is to change these functions so that they exclusively acquire a mutex.

eqvinox · 1y ago

I have a different perspective: the underlying problem is calling setenv(). As far as I'm concerned, the environment is a read-only input parameter set on process creation like argv. It's not a mechanism for exchanging information within a process, as used here with SSL_CERT_FILE.

And remember that the exec* family of calls has a version with an envp argument, which is what should be used if a child process is to be started with a different environment — build a completely new structure, don't touch the existing one. Same for posix_spawn.
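As a rough sketch of that approach in C: the helper below (`env_with` is a hypothetical name, not a libc function) builds a fresh NULL-terminated array for the child and leaves the parent's environment untouched. The result is meant to be passed as the `envp` argument of `execve` or `posix_spawn`.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a new NULL-terminated array containing every
 * entry of `base` except existing definitions of `name`, followed by a newly
 * allocated "name=value" entry. Unchanged strings are shared with `base`,
 * not copied. Returns NULL on allocation failure. */
char **env_with(char *const base[], const char *name, const char *value)
{
    size_t n = 0, name_len = strlen(name);
    while (base[n] != NULL)
        n++;

    char **out = malloc((n + 2) * sizeof *out);
    char *entry = malloc(name_len + strlen(value) + 2);
    if (out == NULL || entry == NULL) {
        free(out);
        free(entry);
        return NULL;
    }
    memcpy(entry, name, name_len);
    entry[name_len] = '=';
    strcpy(entry + name_len + 1, value);

    size_t k = 0;
    for (size_t i = 0; i < n; i++) {
        /* Skip any existing definition of `name`. */
        if (strncmp(base[i], name, name_len) == 0 && base[i][name_len] == '=')
            continue;
        out[k++] = base[i];
    }
    out[k++] = entry;   /* the new entry is always the last non-NULL slot */
    out[k] = NULL;
    return out;
}
```

A caller would do something like `char **envp = env_with(environ, "SSL_CERT_FILE", path);` and then `execve(prog, argv, envp)`. If the exec fails, only the last entry and the array itself need freeing. Reading `environ` this way is itself only safe while no other thread is mutating the environment.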

And, lastly, compatibility with ancient systems strikes again: the environment is also accessible through this:

   extern char **environ;

Which is, of course, best described as bullshit.

diroussel · 1y ago

Indeed, environment variables should be used to configure child processes, not to configure the current process, for non-shell programs, IMHO.

Note that Java, and the JVM, doesn't allow changing environment variables. It was the right choice, even if painful at times.

harrall · 1y ago

You can’t convince me that there is EVER a reason to call setenv() after program init as part of a regular program, outside needing to hack around something specific.

Environment variables are not a replacement for your config. It’s not a place to store your variables.

Even if the env var API is fully concurrent, it is not convention to write code that expects an env var to change. There isn’t even a mechanism for it. You’d have to write something to poll for changes and that should feel wrong.

lamontcg · 1y ago

Environment variables are a gigantic, decades-old hack that nobody should be using... but instead everyone has rejected file-based configuration management and everyone is abusing environment variables to inject config into "immutable" docker containers...

stouset · 1y ago

> As far as I'm concerned, the environment is a read-only input parameter set on process creation like argv.

This holds for a lot of programs, but what if you're writing a shell?

kazinator · 1y ago

The underlying problem isn't just setenv, because the string returned by getenv can be invalidated by another call to getenv. ISO C says:

"The getenv function returns a pointer to a string associated with the matched list member. The string pointed to shall not be modified by the program, but can be overwritten by a subsequent call to the getenv function."

In a single threaded virtual machine, you can immediately duplicate the string returned by getenv and stop using it, right there.

Under threads, getenv is not required to be safe.

I think that with some care, it may be; an environment implementation could guarantee that a non-mutating operation like getenv doesn't invalidate any previously returned strings.

I think POSIX does that. It allows getenv to reallocate the environ array, but not the strings themselves:

"Applications can change the entire environment in a single operation by assigning the environ variable to point to an array of character pointers to the new environment strings. After assigning a new value to environ, applications should not rely on the new environment strings remaining part of the environment, as a call to getenv(), secure_getenv(), putenv(), setenv(), unsetenv(), or any function that is dependent on an environment variable may, on noticing that environ has changed, copy the environment strings to a new array and assign environ to point to it."

environ is documented together with the exec family of functions; that's where this is found.

So whereas there are things not to like about environ, it can be the basis for thread safety of getenv in an application that doesn't mutate the environment.

Joker_vD · 1y ago

> As far as I'm concerned, the environment is a read-only input parameter set on process creation like argv.

Mutating argv is actually quite popular, or at least it used to be.

debugnik · 1y ago

No amount of locking can make the getenv API thread-safe, because it returns a pointer which gets invalidated by setenv, but lacks a way to release ownership over it and unblock setenv safely (or to free a returned copy).

So setenv's existence makes getenv inherently unsafe unless you can ensure the entire application is at a safe point to use them.

tsimionescu · 1y ago

This is actually not that hard to fix.

Getenv() could keep several copies of the value around: one internal copy protected by a mutex, that it never returns, and one copy per thread that it stores in thread local storage. When you call getenv(), it locks the mutex, checks if the current thread's value exists, populates it from the internal copy if not, and returns it. It will also install a new setenv-specific signal handler on this thread and store info about this thread having a copy.

Setenv() will then take the same mutex as getenv(), check if the internal copy is different from the new value; if it is, it will modify the internal copy, modify the local thread's copy if that has one, and then signal each other thread in the process that has a copy in TLS. The setenv signal handler will modify the local copy that thread holds.

It's gonna be slow for a large multi-threaded program, but since setenv() used to corrupt memory for such programs, they probably don't care. And for single-threaded programs, or even for programs that don't access getenv()/setenv() on multiple threads, there should be no extra overhead other than the mutex and the bookkeeping.

The only issues that would remain are programs which send the pointer they get from getenv() to other threads without ensuring locking access, and programs which rely on modifying the pointer from getenv() directly as a way to set an env var, and expect this to be visible across threads. Those are just hopelessly broken and can't use the same API - but aren't more broken than they are today.

Of course, in addition to this complex work to make the old API (mostly) thread safe, it should also offer a new API that simply returns a copy every time, doesn't promise to show modifications to your copy when setenv() gets called (you need to call getenv() again), and puts the onus on you to free that copy explicitly.

kazinator · 1y ago

According to ISO C, getenv returns a pointer to storage that can be overwritten by another call to getenv! Only POSIX slightly fixes it: the string comes from the environ array, and operations on environ by the library preserve the strings themselves (when not replacing or deleting them), just not the array. A program that calls nothing but getenv is okay on POSIX, not necessarily on ISO C.

josefx · 1y ago

C could provide functions to lock/unlock a mutex and require that any attempt to access the environment has to be done holding the mutex. This would still leave the correctness in the hands of the user, but at least it would provide a standard API to secure the environment in a multi threaded application that library and application developers could adopt.

xxs · 1y ago

They can have copy on write of course.

Ferret7446 · 1y ago

Is that a problem? I feel like calling getenv and setenv from different threads is a design antipattern anyway. Any environment setting and loading should happen in the one and only main thread right after process init.

pshc · 1y ago

The underlying problem is that setenv is mutable global state and should never have existed

kazinator · 1y ago

The mutex would have to be held by the caller until it no longer needs the string returned from the environment, or makes a copy:

   stdenvlock();    // imaginary function added to ISO C or POSIX
   char *home = getenv("HOME");
   char *home_copy = strdup(home);
   stdenvunlock();  // only here can we unlock
   // home pointer is now indeterminate

Other solutions:

1. Put the above sequence into a function, and don't expose the mutex. Thread-safe code must use:

   char *home = dupenv("HOME"); // imaginary function; caller responsible for freeing.

2. Provide environment lookup into a buffer:

   getenvbuf("HOME", mybuf, sizeof mybuf);  // returns some value that helps to resize the buffer

All functions that retain pointers out of the classic getenv remain unsafe.

A mutex can be provided to those applications that want to manipulate the environ array directly, or use getenv and setenv, or any combinations of these.

The main problem is all the code out there using getenv.
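Options 1 and 2 might look roughly like this on top of POSIX threads. `dupenv` and `getenvbuf` are the imaginary names from above, and `env_lock` and `setenv_locked` are invented here for illustration. The lock only helps if every reader and writer in the process goes through these wrappers. Code that calls plain getenv, or touches environ directly, is not protected.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical process-wide lock; nothing in libc knows about it. */
static pthread_mutex_t env_lock = PTHREAD_MUTEX_INITIALIZER;

/* Option 1: return a malloc'd copy of the value, or NULL if the variable is
 * unset or allocation fails. The caller frees the result. */
char *dupenv(const char *name)
{
    char *copy = NULL;
    pthread_mutex_lock(&env_lock);
    const char *value = getenv(name);
    if (value != NULL) {
        size_t len = strlen(value) + 1;
        copy = malloc(len);
        if (copy != NULL)
            memcpy(copy, value, len);
    }
    pthread_mutex_unlock(&env_lock);
    return copy;
}

/* Option 2: copy the value into buf (truncating if needed). Returns the full
 * length of the value, so a result >= size means the buffer was too small;
 * returns -1 if the variable is unset. */
int getenvbuf(const char *name, char *buf, size_t size)
{
    int needed = -1;
    pthread_mutex_lock(&env_lock);
    const char *value = getenv(name);
    if (value != NULL)
        needed = snprintf(buf, size, "%s", value);
    pthread_mutex_unlock(&env_lock);
    return needed;
}

/* Writers must take the same lock for the readers above to be safe. */
int setenv_locked(const char *name, const char *value)
{
    pthread_mutex_lock(&env_lock);
    int rc = setenv(name, value, 1);
    pthread_mutex_unlock(&env_lock);
    return rc;
}
```

In both readers the pointer returned by getenv never escapes the locked region, which is the property the classic API cannot offer.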

liontwist · 1y ago

Please no.

If your program wants to use the environment as an out-of-band global var for cross thread communication, you can make your own mutex.

ModernMech · 1y ago

It's the same problem with global vars, but at a machine scope. The real solution here would be for the OS to have a better interface to read and write env vars, more like a file where you have to get rw permission (whether that's implemented as a mutex or what).

eqvinox · 1y ago

This is neither an OS nor a machine scope problem. The environment is provided by the OS at startup. What the process does with it from there on is its own concern.

belter · 1y ago

> It seems like the only reliable way to fix this is to change these functions so that they exclusively acquire a mutex.

A mutex can ensure thread safety but risks deadlocks if not used carefully and will hurt performance...

hamandcheese · 1y ago

Agree about performance, but wouldn't there need to be >1 mutex to risk a deadlock?

fch42 · 1y ago

The problem (with get/set/putenv as they are) isn't the non-use of a mutex. It's the "meaning" of the pointer returned by getenv(). It returns a char*. Never mind the persistence of that value - you can work around that by deliberately leaking memory - but it's writeable. Whether it's a good idea to do so ... well. But simply locking "inside" these funcs doesn't solve all the / your issues.

goeiedaggoeie · 1y ago

setenv and getenv have never been thread safe, why the concern with it now?

loeg · 1y ago

Is that the underlying problem, or is the underlying problem that libraries are using thread-unsafe setenv in threaded contexts when they could just do something else?

db48x · 1y ago

But it would force Rust programs to add their own synchronization mechanism around them. As long as no two threads can call getenv/setenv at the same time then it’s fine.

kibwen · 1y ago

The problem isn't something that Rust can solve.

The Rust stdlib is already using synchronization on the versions of these functions that are exposed from the Rust stdlib. That's why those functions were allowed to be marked as safe in the first place.

The problem is that people are calling C code from Rust (which already requires an unsafe annotation), and then that C code is doing silly thread-unsafe shenanigans for regrettable historical reasons.

It's beyond Rust's power to fix without cooperation from the underlying C code, which happens to be provided by the OS, which is just being compliant with POSIX. Rust can only do so much when the platform itself is hell-bent on sabotaging you.

thayne · 1y ago

In particular, it doesn't help if you call a C function that indirectly modifies the environment with FFI.

zamalek · 1y ago

> Nowadays the best solution to this issue is "stop using this crate" with libraries like rustls.

Nice to see that the author of the library has a sensible take. Unfortunately the ecosystem does not: https://github.com/seanmonstar/reqwest/blob/master/Cargo.tom...

benatkin · 1y ago

People get trained to ignore the ____UNSAFE_payattention__nevermindthatthisappears50timesinthisfile___ blocks and prefixes

This also shows up in web frameworks where Vue has the v-html directive and React has dangerouslySetInnerHTML. Vue definitely has it better.

crooked-v · 1y ago

In the React world, the only times I've seen dangerouslySetInnerHTML consistently used is for outputting string literal CSS content (and this one is increasingly rare as build tools need less handholding), string literal JSON content (for JSON-LD), and string literal premade scripts (i.e. pixel tags from the marketing content). That's not to say there's no danger surface there, but it's not broadly used as a tool outside of code that's either really bad or really exhaustively hand-tuned.

javier2 · 1y ago

I've only really seen dangerouslySetInnerHTML used while transitioning from certain kinds of server side rendering to React. There are still lots of really old internal tools in ancient HTML out there.

rerdavies · 1y ago

Code syntax highlighting libraries for React use dangerouslySetInnerHTML.

benatkin · 1y ago

React doesn't have a tag and attribute sanitizer built in, so having non-js-programmers edit JSX isn't especially safe anyways, as an img or a href could exfiltrate data. If it were they could just block out an innerHTML attribute. A js programmer can get around it by setting up a ref and then using the reference to set innerHTML without the word dangerously appearing.

forrestthewoods · 1y ago · 20 in thread

Mutable global state is evil. Friends don’t let friends use mutable global state.

I hate envvars. It’s “the Linux way”. I avoid them like the plague. A++ strong recommend.

libc is terrible. The world needs to move on.

01HNNWZ0MV43FF · 1y ago

Env vars are good if you treat them as read-only within the process

forrestthewoods · 1y ago

I’ll take a config file over an envvar 100% of the time.

msully4321 (OP) · 1y ago

Yeah, setenv should probably just not exist, and environment variables should be only set when spawning new processes.

maep · 1y ago

> Mutable global state is evil. Friends don’t let friends use mutable global state.

Throw away your CPU and RAM then.

snowfarthing · 1y ago

There are certainly levels of the abstraction pyramid where mutable global state is unavoidable; however, it shouldn't be too difficult to get to a point where we have enough abstraction so that we don't need to worry about mutable global state for what we do.

And even if those abstractions can't be 100% effective, we'd go a long way to achieving the desirable results of getting rid of it, if we just develop the mindset of avoiding it if at all possible, excepting for very rare instances where it's needed as a last resort.

incrudible · 1y ago

Your CPU has an MMU in order to (among other things) let the OS prevent mutable global state.

forrestthewoods · 1y ago

I can not possibly roll my eyes hard enough.

Go ahead and write lots of mutable global statics. But when your program crashes randomly and you need my help to debug and it is, once again, a global mutable then you have to perform a walk of shame.

titzer · 1y ago

And disks. And the cloud. Or basically, you know, computers.

kibwen · 1y ago

Don't threaten me with a good time.

incrudible · 1y ago

Ah yes, the cloud where we all happily share compute resources without any restrictions to avoid stomping on each other's toes.

layer8 · 1y ago

The universe, you mean.

sim7c00 · 1y ago

what do you suggest as alternative?

the problem is not linux, not mutable global state or resources and not libc.

the problem is not getting time at work to do things properly. like spotting this in GDB before the issue hit, because your boss gave you time to tirelessly debug and reverse your code and anything it touches....

there is too much money in halfbaked code. sad but true.

viraptor · 1y ago

It definitely is the current libc. That one's proven by systems which do not have the same problem. Then the next layer problem is trying to pretend we can get everyone to pay attention and avoid bugs in code instead of forcing interfaces and implementations where those bugs are not possible.

sim7c00 · 1y ago

just because someone makes a window doesn't mean you gotta jump out of it. there are good and bad uses for things, and the bad ones should be avoided lest one hurt themselves?

glouwbug · 1y ago

libc moved the world into the Information Age

kibwen · 1y ago

In the same way that Yersinia pestis moved the world into the Renaissance?

glouwbug · 1y ago

Yes, neither were memory or thread safe

jimbob45 · 1y ago

What's your preferred alternative?

andrewmcwatters · 1y ago

Don’t use a mouse or a monitor then.

snowfarthing · 1y ago

One of the reasons X is being phased out in favor of Wayland is because X is far more global than it needs to be -- and this is one of the reasons it has security risks that can't be completely removed without API-breaking effects.

shikon7 · 1y ago · 18 in thread

I wonder why it is so hard for Rust to implement its own safe stdlib independent of C.

dgrunwald · 1y ago

How exactly would that help in this situation?

If both Rust and C have independent standard libraries loaded into the same process, each would have an independent set of environment variables. So setting a variable from Rust wouldn't make it visible to the C code, which would break the article's use case of configuring OpenSSL.

The only real solution is to have the operating system provide a thread-safe way of managing environment variables. Windows does so; but in Linux that's the job of libc, which refuses to provide thread-safety.

rcxdude · 1y ago

If there was a libc implemented in rust (like https://github.com/redox-os/relibc), you could use that for the C code in the process, and you'd be sharing the relevant state.

do_not_redeem · 1y ago

The crash in the article happened when Python called C's getenv. Rust could very well throw away libc, but then it would also be throwing away its great C interop story. Rust can't force Python to use its own stdlib instead of libc.

steveklabnik · 1y ago

Linux is an unusual platform in that it allows you to call into it via assembly. Most other platforms require you to go through libc to do so. It's not really in Rust's hands.

PaulDavisThe1st · 1y ago

This is not unusual at all. Windows allowed it for years before Linux came along. It was also true of some other *nix systems - IIRC, Ultrix (DEC) allowed this, and so did Dynix (Sequent).

*BSD allows it too, or did as of 2022.

What is unusual about Linux is that it guarantees a syscall ABI, meaning that if you follow it, you can make a system call "portably" across "any" version of Linux.

kbolino · 1y ago

They did, it's called core. But it assumes no operating system at all, and environment variables require an operating system.

nomel · 1y ago

> and environment variables require an operating system

Is that true? It's just a process global string -> string map, that can be pre-loaded with values before the process starts, with a copy of the current state being passed to any sub-process. This could be trivially implemented with batch processing/supervisory programs.

kbolino · 1y ago

Sure, there's a broader concept here, which doesn't require any operating system. But any alternate string->string map you define won't answer to C code calling getenv, won't be passed to child processes created with fork, won't be visible through /proc/$PID/environ, etc.

panzi · 1y ago

Well, it's used by the OS when exec-ing a new process, but at least the Linux syscall for that takes the environment as an explicit parameter. So it could be managed in whatever way by the runtime until execve() is called.

sunshowers · 1y ago

Environment variables are not just technical, they're social. You need to get everyone on board with your scheme.

xxs · 1y ago

> and environment variables require an operating system.

Yes, to read them. If Rust wishes to modify them, it can modify its own, already-copied structure. I'd do that in pretty much any language.

zanderwohl · 1y ago

It would be a tremendous amount of work, and would take years. Meanwhile, the problems are avoidable. It's not exactly the "rust way" to just remember and avoid problems, but everything in language design is compromises.

IshKebab · 1y ago

"Impossibru!!"

https://github.com/sunfishcode/eyra

Oh look:

> Why use Eyra? It fixes Rust's set_var unsoundness issue. The environment-variable implementation leaks memory internally (it is optional, but enabled by default), so setenv etc. are thread-safe.

zanderwohl · 1y ago

> Why not use Eyra?

Well, that's a lot of caveats. As I said, it would take years to complete. And it looks like it's well on its way but not near complete.

dvtkrlbs · 1y ago

If I understand it correctly this still doesn't help with downstream dependencies.

kbolino · 1y ago

That's quite a trade-off

sunshowers · 1y ago

That only works on Linux though right?

v3xro · 1y ago

There are Rust libc implementations, e.g. one by Redox: https://gitlab.redox-os.org/redox-os/relibc

ChrisSD · 1y ago · 13 in thread

In the Rust std, `set_var` and `remove_var` will correctly require using an `unsafe {}` block in the next edition (2024). The documentation does now mention the safety issue but obviously it was a mistake to make these functions safe originally (albeit a mistake even higher level languages have made).

https://doc.rust-lang.org/stable/std/env/fn.set_var.html

There is a patch for glibc which makes `getenv` safe in more cases where the environment is modified but C still allows direct access to the environ so it can't be completely safe in the face of modification https://github.com/bminor/glibc/commit/7a61e7f557a97ab597d6f...

jrmg · 1y ago

Wow, glibc now

> keep[s] older versions around and adopt[s] an exponential resizing policy. This results in an amortized constant space leak per active environment variable, but there already is such a leak for the variable itself (and that is even length-dependent, and includes no-longer used values).

There have got to be pathological uses out there where this will cause unbounded memory growth in well-formed (according to the API) programs, no?

Interesting to see this _introduce_ a ‘bug’ (unbounded memory growth) for these programs that follow the API in order to ‘fix’ programs that don’t (by using the API in multiple threads). Pragmatism over dogma I guess. Leaves me feeling a bit sketched out though.

GoblinSlayer · 1y ago

FWIW you can make a singly linked list with infinite number of nodes too. Memory leaks happen in well formed programs just fine, glibc is just one of many examples.

Thaxll · 1y ago

Why require unsafe when the std implementation could take care of the synchronisation?

masklinn · 1y ago

Because the std implementation cannot force synchronisation on the libc, so any call into a C library which uses getenv will break... which is exactly what happened in TFA: `openssl-probe` called env::set_var on the Rust side, and the Python interpreter called getenv(3) directly.

rerdavies · 1y ago

But the standard implementation could copy the environment at startup, and only use its copy.

And the library's use of setenv is clearly a bug as setenv is documented to be not threadsafe in the C standard library. So that would take care of that problem.
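A sketch of the snapshot idea in C, with hypothetical names (`env_snapshot_init`, `env_snapshot_get`): copy `environ` once, before any threads start, and answer later lookups from the private copy. This only covers code that uses the snapshot. A C library that calls getenv directly still reads the live environment.

```c
#include <stdlib.h>
#include <string.h>

extern char **environ;

/* Private deep copy of the environment; never modified after init. */
static char **snapshot;

/* Call once, early in main(), while the process is still single-threaded.
 * Returns 0 on success, -1 on allocation failure. */
int env_snapshot_init(void)
{
    size_t n = 0;
    while (environ != NULL && environ[n] != NULL)
        n++;

    char **copy = calloc(n + 1, sizeof *copy);   /* last slot stays NULL */
    if (copy == NULL)
        return -1;

    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(environ[i]) + 1;
        copy[i] = malloc(len);
        if (copy[i] == NULL) {
            while (i > 0)
                free(copy[--i]);
            free(copy);
            return -1;
        }
        memcpy(copy[i], environ[i], len);
    }
    snapshot = copy;
    return 0;
}

/* Look up `name` in the snapshot. The returned pointer stays valid for the
 * life of the process because the snapshot is never freed or rewritten. */
const char *env_snapshot_get(const char *name)
{
    size_t len = strlen(name);
    if (snapshot == NULL)
        return NULL;
    for (char **p = snapshot; *p != NULL; p++) {
        if (strncmp(*p, name, len) == 0 && (*p)[len] == '=')
            return *p + len + 1;
    }
    return NULL;
}
```

Because the snapshot is immutable after initialization, any number of threads can read it without locking. The trade-off is that later setenv calls are invisible to it.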

miohtama · 1y ago

Is it possible to skip libc completely or would this introduce too many portability concerns?

ChrisSD · 1y ago

It can only synchronize if everything is using Rust's functions. But that's not a given. People can use C libraries (especially libc) which won't be aware of Rust's locks. Or they could even use a high level runtime with its own locking but then they'll be distinct from Rust's locks.

The only way to coordinate locking would be to do so in libc itself.

wahern · 1y ago

libc does do locking, but it's insufficient. The semantics of getenv/setenv/putenv just aren't safe for multi-threaded mutation, period, because the addresses are exposed. It's not really even a C language issue; were you to design a thread-safe env API, for C or Rust, it would look much different, likely relying on string copying even on reads rather than passing strings by reference (reference counted immutable strings would work, too, but is probably too heavy handed), and definitely not exposing the environ array.

The closest libc can get to MT safety is to never deallocate an environment string or an environ array. Solaris does this--if you continually add new variables with setenv it just leaks environ array memory, or if you continually overwrite a key it just leaks the old value. (IIRC, glibc is halfway there.) But even then it still requires the application to abstain from doing crazy stuff, like modifying the strings you get back from getenv. NetBSD tried adding safer interfaces, like getenv_r, but it's ultimately insufficient to meaningfully address the problem.

The right answer for safe, portable programs is to not mutate the environment once you go multi-threaded, or even better just treat process environment as immutable once you enter your main loop or otherwise finish with initial process setup. glibc could (and maybe should) fully adopt the Solaris solution (currently, IIRC, glibc leaks env strings but not environ arrays), but if applications are using the environment variable table as a global, shared, mutable key-value store, then leaking memory probably isn't what they want, either. Either way, the best solution is to stop treating it as mutable.

demurgos · 1y ago

It can't ensure synchronization because any code using libc could bypass the sync wrapper. In particular, Rust lets you link C libs which wouldn't use the Rust stdlib.

msully4321 (OP) · 1y ago

Because it can still race with C code using the standard library. getenv calls are common in C libraries; the call to getenv in this post was inside of strerror.

fsckboy · 1y ago

you've gotten a lot of answers which say the same thing, but which I don't think answer your question:

synchronization methods impose various complexity and performance penalties, and single threaded applications which don't need that would pay those penalties and get no benefit.

Unix was designed around a lightweight ethos that allowed simple combining of functions by the user on the command line. See "worse is better", but tl;dr that way of doing things proved better, and that's why you find yourself confronting what it doesn't do.

davidt84 · 1y ago

The real problem is that getenv() and setenv() were created before threads were really a thing.

sunshowers · 1y ago

Well it was better in the short term but is worse in the long term. In particular, the error handling situation is generally atrocious, which is fine for interactive/sysadmin use but much worse for serious production use.

vlovich123 · 1y ago · 8 in thread

Even if C stdlib maintainers are resistant to making setenv multi-thread safe, at a minimum there should be a new alternative thread-safe API defined, whether within POSIX or as a de facto standard that POSIX is forced to adopt over time. If the time spent explaining why nothing could be done had instead been spent fixing this problem, a new thread-safe API could have replaced the old setenv, which could have been deprecated and removed from many software projects.

I'm also not convinced by Musl's maintainer that it can't be fixed within Musl considering glibc is making changes to make this a non-issue.

usefulcat · 1y ago

The biggest problem is not the absence of a thread safe API, it's the existence of this:

    extern char **environ;

As long as environ is publicly accessible, there's no guarantee that setenv and getenv will be used at all, since they're not necessary.

If you're willing to get rid of environ, it's pretty trivial to make setenv and getenv thread safe. If not, then it's impossible, although one could still argue that making setenv and getenv thread safe is at least an improvement, even if it's not a complete solution (aka don't let the perfect be the enemy of the good).

vlovich123 · 1y ago

> aka don't let the perfect be the enemy of the good

Exactly my point. Over time *environ would disappear, at least from the major software projects that everyone uses (assuming it's even in use in them in the first place).

aragilar · 1y ago

That still doesn't mean getenv would be safe. Unless you know nothing uses **environ (e.g. by breaking the ABI, which no-one will do because it'll break everything), you can't rely on getenv being safe.

IshKebab · 1y ago

Yeah I don't think I've ever seen a single use of it. However I just checked on grep.app and at least a few big projects use it - git, nginx, PostgreSQL, neovim, etc, which suggests that setenv/getenv is not sufficient.

panzi · 1y ago

Guess that would also require some locking for all the exec() functions that don't take the environment as a parameter or that search PATH for the executable.

davidt84 · 1y ago

I'm not convinced by you that you know more than the experts who have determined there is no backwards-compatible way to fix this.

vlovich123 · 1y ago

I'll take existence proofs [1] over personal insults but YMMV. You also may want to be careful assuming the expertise of people on this forum. Some people here are quite technical.

[1] https://github.com/bminor/glibc/commit/7a61e7f557a97ab597d6f...

davidt84 · 1y ago

That isn't thread safe, it's safER.

I am also quite technical, thanks.

jandrese · 1y ago · 7 in thread

Yet another person is burned by calling setenv() in a multi-threaded context. There really needs to be a big warning banner on the manpage for setenv() that warns about this because it seems like a far more common problem than you would expect.

umpalumpaaa · 1y ago

The man page says:

> POSIX.1 does not require setenv() or unsetenv() to be reentrant.

A non-reentrant function cannot be thread safe.

In general (for POSIX, libc and many other libraries): if the docs do not explicitly say "this function is thread safe", it is not.

wmf · 1y ago

It's time to move beyond this attitude and make things safe by default. For example, Solaris has a safer version of setenv().

"It is ridiculous that this has been a known problem for so long. It has wasted thousands of hours of people's time, either debugging the problems, or debating what to do about it. We know how to fix the problem." https://www.evanjones.ca/setenv-is-not-thread-safe.html

jabl · 1y ago

> A non-reentrant function cannot be thread safe.

Actually, a non-reentrant function can be thread-safe. A common example of such a function in libc is malloc().
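One way to see the distinction, taking "reentrant" in the sense of "safe to re-enter from a signal handler on the same thread": the hypothetical `next_id` below is thread-safe because a mutex serializes callers, but it is not reentrant. A signal handler that interrupts it between lock and unlock and calls it again would try to re-lock a mutex its own thread already holds, which deadlocks or is undefined for a default mutex.

```c
#include <pthread.h>

/* Shared state guarded by a mutex (names are illustrative only). */
static pthread_mutex_t id_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long id_counter;

/* Thread-safe: concurrent callers are serialized by id_lock.
 * Not reentrant: must not be called from a signal handler, because the
 * interrupted thread may already hold id_lock. */
unsigned long next_id(void)
{
    pthread_mutex_lock(&id_lock);
    unsigned long id = ++id_counter;
    pthread_mutex_unlock(&id_lock);
    return id;
}
```

A lock-based malloc has the same shape, which is why POSIX does not list malloc among the async-signal-safe functions even though it is safe to call from multiple threads.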

adrian_b · 1y ago

By definition, a "reentrant function" is a function that may be invoked even when it has not returned yet from a previous invocation.

So a non-reentrant function is a function that may not be invoked again between a previous invocation and returning from that invocation.

When a function may be invoked from different threads, then it is certain that sometimes it will be invoked by a thread before returning from a previous invocation from a different thread.

Therefore any function that may be invoked from different threads must be reentrant. Otherwise the behavior of the program is unpredictable. Reentrant functions may be required even in single-thread programs, when they may be invoked recursively, or they may be invoked by signal handlers.

An implementation of "malloc" may be reentrant or it may be non-reentrant.

Old "malloc" implementations were usually non-reentrant because they used global variables for managing the heap. Such "malloc" functions could not be used in multi-threaded programs.

Modern "malloc" implementations are reentrant, either by using only thread-local storage or by using shared global variables to which some method for concurrent access is implemented, e.g. with mutual exclusion.

2 more replies

sumtechguy1y ago

I could be wrong, but isn't that because each thread has its own heap?

01HNNWZ0MV43FF1y ago

Funny enough, the Rust wrapper `std::env::set_var` does have a big warning https://doc.rust-lang.org/std/env/fn.set_var.html

subarctic1y ago

Looks like that Safety section was added in 1.76.0. It'll be an even bigger warning in the future since it's now going to be unsafe in Rust 2024

StillBored1y ago· 6 in thread

It's like a rite of passage to be hit by an environment related bug on linux, which is mysteriously less a problem on other unix's. Which is sorta funny given how pragmatic Linus and the kernel are about fixing POSIX bugs by making them not happen, while glibc is still lagging here decades after people tried to at least make the problem better. Sure there is all the crap around TZ/etc, but simply providing getenv_r() and synchronizing it with setenv() and warning during compile/link on getenv() would have killed much of the problem. Never mind actually doing a COW-style system where the env pointer(s) are read only. Instead the problem is pushed to the individual application, which is a huge mistake, because application writers are rarely aware of what their dependencies are doing. Which is the situation I found myself in many many years ago. The closed source library vendor, at the time, told us to stop using that toy unix clone (linux).

kelnos1y ago

> environment related bug on linux, which is mysteriously less a problem on other unix's.

How do you figure? The problem isn't the implementation, it's the API. setenv(), unsetenv(), putenv(), and especially environ, are inherently unsafe in a multithreaded program. Even getenv_r() can't really save you, since another thread may be calling setenv() while the (old) value of an env var is being copied into the provided buffer. Sure, a getenv_r() fixes the case where you get something back from getenv(), and then another thread calls setenv() and makes that memory invalid, but there's no way to protect the other calls breaking the API.

There are ways to mitigate some of the issues, like having libc hold a mutex when inside getenv()/setenv()/putenv()/unsetenv(), but there's still no way for libc to guarantee that something returned by getenv() remains valid long enough for the calling code to use it (which, right, can be fixed by getenv_r(), which could also be protected by that mutex). But there's no good way to make direct access to environ safe. I suppose you could make environ a thread-local, but then different threads' views of the environment could become out of sync, permanently (and you could get different results between calling getenv_r() and examining environ directly).

Back-compat here is just really hard to do. Even adding a mutex to protect those functions could change the semantics enough to break existing programs. (Arguably they're already broken in that case, but still...)

depr1y ago

>> environment related bug on linux, which is mysteriously less a problem on other unix's.

> How do you figure?

From https://illumos.org/man/3C/putenv:

> The putenv() function can be safely called from multithreaded programs

anonfordays1y ago

Considering this is a libc issue, not a Linux specific one, I wonder how thread safe other libc implementations like musl and Bionic are. How do the BSDs stack up? Humorously, illumos also ships with glibc...

rerdavies1y ago

Why does adding a mutex break the API? I guess it breaks `char**environ`. But the API wouldn't be broken.

benmmurphy1y ago

I think you would have to change the API to return a copy of the string as the get_env result which the caller is responsible for free-ing or the env implementation would have to ensure returned values from get_env are stable and never change which is effectively a memory leak.

1 more reply

einpoklum1y ago

> Even getenv_r() can't really save you, since another thread may be calling setenv() while the (old) value of an env var is being copied into the provided buffer.

Wouldn't that depend on the libc implementation? For example, maybe setenv writes to another buffer, then swaps pointers atomically; wouldn't that work?
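A rough sketch of the "build privately, then swap the pointer atomically" idea, under heavy assumptions: it models a single variable rather than a whole environment table, the names `publish_value` and `read_value` are hypothetical, and it is not how any particular libc works. Note the catch: the old string can never be freed, because a reader may still be holding a pointer to it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical single-variable "environment": one atomically
 * published string. A real libc would publish a whole table. */
static _Atomic(char *) published = NULL;

/* Writer: build the new value privately, then publish it with one
 * atomic store. The old string is intentionally never freed, because
 * a reader may still hold a pointer to it. Returns 0 on success. */
int publish_value(const char *value)
{
    assert(value != NULL);
    char *copy = strdup(value);
    if (copy == NULL)
        return -1;
    atomic_store(&published, copy);   /* previous value is leaked */
    return 0;
}

/* Reader: the returned pointer stays valid for the life of the
 * process, but only because old values are leaked. */
const char *read_value(void)
{
    return atomic_load(&published);
}
```

So the swap itself can work for readers that go through the function; the cost is unbounded memory growth for programs that set values repeatedly, and it does nothing for code that walks `environ` directly.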

masklinn1y ago· 6 in thread

Previously on setenv being a terrible thing: https://www.evanjones.ca/setenv-is-not-thread-safe.html (discussion: https://news.ycombinator.com/item?id=38342642 first comment is even about it causing issues in Rust)

Animats1y ago

Yes. That's known.

Most of the rest of the problem here seems to be the development environment. They're testing on a remote machine in an Amazon data center and using Docker. This rig fails to report that a process has crashed. Then they don't have enough debug symbol info inside their container to get a backtrace. If they'd gotten a clean backtrace reported on the first failure, this would have been obvious.

Why is anyone using "setenv" anyway?

mmastrac1y ago

Yup, it's mostly just the story and tools we used to get ourselves out of a mess that was made harder by some decisions made earlier -- the tests were running in a container with stripped symbols (we're going to ship symbols after this, no reason to over-optimize), our custom test runner failed to report process death (an oversight).

There's no reason setenv should have been called here. The `openssl-probe` library could simply return the paths to the system cert files and callers could plug those directly into the OpenSSL config.

Oversights all around and hopefully this continues to improve.

mark_undoio1y ago

> Yup, it's mostly just the story and tools we used to get ourselves out of a mess that was made harder by some decisions made earlier -- the tests were running in a container with stripped symbols (we're going to ship symbols after this, no reason to over-optimize)

It's worth noting here that you can also build your binaries and keep debug symbols separately.

You don't need to ship them with the binary (although it will make many scenarios a bit simpler if you do, since you'll always have the right ones available).

Some info that might help: https://www.tweag.io/blog/2023-11-23-debug-fission/ https://undo.io/resources/gdb-watchpoint/reduce-binary-size-...

nemetroid1y ago

> we're going to ship symbols after this, no reason to over-optimize

You might want to look into debuginfod.

masklinn1y ago

> Why is anyone using "setenv" anyway?

Because it’s there and it looks like a good idea until it takes one of your fingers.

einpoklum1y ago

It really does not look like a good idea to setenv() . The very notion is quite terrifying. Messing with a bunch of globals, that other code knows about as well? Nuh-uh.

The thing is, the OP people weren't doing that at all, it was some irresponsible library maintainers. If your code does that, you have to include something like the "surgeon general's warning" everywhere: "CAREFUL: USING THIS LIBRARY MAY CAUSE TERMINAL CRASHES".

3 more replies

nwellnhof1y ago· 6 in thread

> Our nightly CI machines run on Amazon AWS, which has the advantage of giving us a real, uncontainerized root user.

> We don’t have the necessary files outside of the container, and our containers are quite minimal and don’t allow us to easily install gdb.

Have people lost the ability to build and debug their code locally, without clouds and containers?

api1y ago

Yes. It’s shocking just how much cloud SaaS has distorted people’s understanding of things. You need all kinds of layers of cloud complexity and deployment to do the most trivial stuff. We have 100% reversed the PC revolution and returned to the era of clunky expensive mainframe computing.

The reason is that cloud is where all the money is because cloud is DRM. Put software there and you can charge a subscription and nobody can evade it and you have perfect lock in forever. People usually can’t even get their data out. You can also do all kinds of realtime analytics conveniently to optimize your product.

Computing architecture is downstream of the business model. Mainframe died originally because there was no Internet and PCs were cheaper, but vendors also lost a lot of their lock in power. Now they have a way to bring a model that is much more profitable back. No more pesky freedom for users, who to be fair if given such freedom will often just refuse to pay, making quality software a non-viable business.

Tangent I know.

bluGill1y ago

There is a lot to like about the cloud model as a user. I can access my data wherever I am, from whatever device I have, and I won't lose it to a disc crash.

There are faults to the cloud, but it solves real problems users have.

api1y ago

There are other ways that could be achieved, like cloud storage constantly mirroring local but encrypted with local keys or keys controlled by the user.

This is the iCloud model and it works. Imagine a more open version with competing storage providers.

This, however, would hand control back to the user, which would be bad for the software industry with its addiction to lock in and recurring revenue.

2 more replies

bluGill1y ago

This is a random crash only on ARM. I doubt they could get the crash to happen locally - most likely their developer machines were all x86 where it never crashed.

They should have handled crashes better - a problem they seem to recognize, but not the issue here, so not covered.

msully4321OP1y ago

> Have people lost the ability to build and debug their code locally, without clouds and containers?

No, of course not, but it didn't crash on our machines!

mardifoufs1y ago

How would you debug locally when you probably don't have a device that runs the arch that is causing an issue? It's much faster to just debug in the actual environment where the failure happens anyways.

kelnos1y ago· 5 in thread

This reminded me of that whole "12-factor app" movement, which several of my former coworkers had really bought into. One of the "factors" is that apps should be configured by environment variables.

I always thought this was kinda foolish: your configuration method is a flat-namespace basket of stringly-typed values. The perils of getenv()/setenv()/environ are also, I think, a great argument against using env vars for configuration.

Sure, there aren't always great, well-supported options out there. I prefer using a configuration file (you can have templated config and a system that fills in different values for e.g. dev/stage/prod), and I'll usually use YAML, despite its faults and gotchas. There are probably better configuration file formats, but IMO YAML is still significantly better than using env vars.

shortrounddev21y ago

I often find that there's a lot of intense animosity towards Windows and Microsoft, but a lot of their API design is vindicated by time. Environment variables can be typed and templated in NT, not to mention there's a namespaced config database (the registry, even if it's really verbose and strange). Plus MSVC provides threadsafe versions of nearly every stdlib function. I often hear new C/C++ developers lament the lack of POSIX compatibility with MSVC, but without a lot of consideration for what that actually means; they just want cross compatibility with C programs written in the 1990s

__MatrixMan__1y ago

I have similar reservations about env vars. I dislike how they can be read from anywhere--it interrupts the ability to reason about a function's behavior from its signature and makes impure plenty of functions that could otherwise have been pure.

If there were a language feature that let me mark apps such that during any process env vars are not writable and are readable only once (together, in a batch, not once per var), I'd use it everywhere.

eqvinox1y ago

getenv() is perfectly fine, it's setenv() that is the problem. Which in theory this wouldn't be using since the env would be set up prior to starting that mystical app.

But yes, a flat namespace, with string values, shared as a free-for-all with who knows what libraries and modules you're loading… that's not a good idea even if it didn't have safety issues in setenv().

jillesvangurp1y ago

There probably should be an addendum to the "12-factor app" movement that says that the environment should be treated as read only for the duration of the process. Most of the issues people talk about here seem to relate to people trying to abuse the environment as some kind of key value store for mutable global state (which sounds like a bad idea). Why would you even want to do that?!

Being on the JVM, which actually treats the environment as immutable and which probably inspired a lot of the 12 factor app movement (with companies like Soundcloud being big Scala and Java users and pushing this), I've never experienced any issues with the environment changing on me or causing any threading issues. The environment is effectively immutable and there's nothing in my processes that sneakily circumvents that (via some native calls into libc). So, complete non issue on the JVM.

Even if somebody manages to modify the environment, the immutable copy stays the same. That copy gets created on JVM startup and is immutable. Anything using normal Java APIs to interact with the environment will never see the modification. I'm sure people might have tried to work around that but it's not a widespread practice. Because, again, why would you even want to do that?

The problem with configuration files is that their parsing is process specific. That's why Linux/Unix is such a mess. Every single tool seems to have its own conventions and mechanisms for configuration. There are no standards for this.

Other of course than the Docker ecosystem. You can do whatever you want inside the container but effectively your only interface to the outside world is either messily mounting some volume and doing whatever convoluted way of configuration your app requires; or just using environment variables. Most modern software is docker ready/friendly in the sense that you can fully control their behavior via the environment. It's perfectly adequate for most things that people run via docker these days. Which of course is pretty much anything.

And of course with Docker compose or kubernetes (which I'm not necessarily a fan of) you get yaml files defining lists of environment variables that define how your process starts. So you more or less get what you are asking for. I'm not a big YAML fan but it works well enough. Too much potential for syntax issues really ruining your day IMHO. But it's not like the alternatives are free of issues.

johnny221y ago

This is unrelated really. If you read your environment variables into config and never touched them again, then you're totally safe.

I personally use 12 factor app style, but once it's entered the app I validate the env variables and data and then store them. It's totally fine after that.
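A minimal sketch of that "read once, validate, store" pattern in C. The variable names `APP_HOST` and `APP_PORT` and the `app_config` struct are made up for illustration; the point is only that everything is copied out of the environment at startup, before any threads exist, so nothing later depends on getenv().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical settings; APP_HOST and APP_PORT are invented names. */
struct app_config {
    char *host;   /* private copy, owned by the struct */
    int   port;
};

/* Call once at startup, before any threads exist. Copies and
 * validates what it needs so later code never calls getenv().
 * Returns 0 on success, -1 if a value is missing or invalid. */
int load_config(struct app_config *cfg)
{
    assert(cfg != NULL);

    /* Parse the port immediately, before the next getenv() call. */
    const char *port = getenv("APP_PORT");
    if (port == NULL)
        return -1;
    char *end = NULL;
    long p = strtol(port, &end, 10);
    if (end == port || *end != '\0' || p < 1 || p > 65535)
        return -1;

    /* Copy the host string so we own it from here on. */
    const char *host = getenv("APP_HOST");
    if (host == NULL)
        return -1;
    cfg->host = strdup(host);
    if (cfg->host == NULL)
        return -1;

    cfg->port = (int)p;
    return 0;
}
```

After `load_config()` succeeds, the rest of the program reads `cfg` and never touches the environment again.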

HarHarVeryFunny1y ago· 5 in thread

What is the rationale for libc not making setenv/getenv thread safe? It does seem rather odd given how environment variables are explicitly defined as shared between threads in the same process!

It doesn't seem it would take much to do it efficiently, even retaining the poor getenv() pointer-returning API (which could point to a thread local buffer). The coordination between getenv and setenv could be very lightweight - spinlock vs mutex.

jeroenhd1y ago

The spec says it's not supposed to be thread safe.

There's also no real backwards compatible way of fixing setenv(). getenv() returns a pointer that can be read at any time, and then there's the *environment parameter that can also be used to read env variables.

IMO the entire API should be deprecated for a thread safe one, but until someone comes with a standard setenv() alternative that's implemented by the libc runtimes, we'll be stuck with the shitty POSIX API, and every year we will read blog posts about get/setenv() crashing processes.

4gotunameagain1y ago

I think the argument was that the standard states that setenv is not thread safe, although from what I see it says that it does not have to be thread safe:

  The setenv( ) function need not be thread-safe. A function that is not required to be thread-safe is not required to be reentrant.

https://www.open-std.org/jtc1/sc22/open/n4217.pdf.

Page.. 1860 :')

HarHarVeryFunny1y ago

Sure, but given that Linux defines the environment as state that's shared between threads, not having a thread-safe way of accessing it is hard to defend...

Is "the standard says it doesn't NEED to be thread safe" the argument that the Linux libc maintainers are using for not enhancing it to be thread safe, or is it based on some technical or backwards compatibility issues in doing so ?

debugnik1y ago

The only thread-safe way to implement getenv/setenv as they currently exist is to leak the previous state when setenv allocates, such that existing pointers stay valid. The existing API simply lacks a mechanism to synchronize correctly.

Leaking would be good enough for many use cases, but it would break long-running users of setenv (mainly those with libraries abusing env vars, as in TFA), and doesn't even solve how they interact with putenv and environ. This whole API is just cursed.

Libc could of course get better APIs, like GetEnvironmentVariable on Windows, but that won't fix all existing code.

1 more reply

saagarjha1y ago

The rationale is that it was implemented before threads existed, and now can't be retrofitted with thread safety.

vrtx01y ago· 5 in thread

Let me try to help:

1. If a process crashes and dumps, be sure to look at the system log of the cause (e.g. SIGSEGV, OOM, invalid instruction, etc.)

2. Be certain you’re looking at the right core dumps — I believe UID 1000 just means posix UserID (which is unrelated to a PID), though I don’t use containers.

3. Stay focused on the right level of abstraction — memory model details are great to know, but irrelevant here.

4. Variables do not correlate 1:1 with registers, except in C calling conventions. The assumption about x20 and a local variable is incorrect, unfortunately.

5. getenv() and setenv() do not work as implied in the post. When a process starts via execve(), the OS/libc constructs a new snapshot of the environment, and cannot be modified by an ancestral process. It’s a snapshot in time, unless updated by the process itself. When a process fork()s, the child gets a new copy of the parent’s environment — updates do not propagate.

getenv() is thread safe and reentrant. You don’t use an environment to pass shared data; setenv() is generally used when constructing the environment for a child process before a fork(). See man environ.

6. FWIW, ‘char** env’ is a null-terminated array of pointers, so dumping memory from *env (or env[0]) is only valid until you hit the first NULL. The size of the array is not stored in the array.

I hope this helps! And apologies if this is redundant — I read so many comments; mostly variations of “the problem with getenv is x”, but gave up before reading all of the (currently) 168 comments.
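For point 6 above, a small sketch of what "NULL-terminated array of pointers" means in practice. The helper name `env_count` is hypothetical; the only way to learn the length is to walk the array until the terminating NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Count the entries in a NULL-terminated array of "NAME=value"
 * strings, such as the envp passed to execve(). The length is not
 * stored anywhere; the terminating NULL is the only end marker. */
size_t env_count(char *const *envp)
{
    assert(envp != NULL);
    size_t n = 0;
    while (envp[n] != NULL)
        n++;
    return n;
}
```

If something overwrites that terminating NULL, a walk like this runs off the end of the array.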

saagarjha1y ago

I'm kind of confused by this response. It doesn't seem to match the actual article? For example, they consulted the code to find what x20 had in it, rather than blindly guessing. Doing that is perfectly fine and even desirable when analyzing crashes. There is no forking mentioned. People call setenv all the time when trying to modify their own environment (hence the crashes!). Nobody said anything about the size of env.

vrtx01y ago

x20 is a general purpose register; optimizing compilers can use it for any number of variables, immediate values or intermediate computations at different points within that same function — or none at all (the variable ep could be optimized away).

Re: fork(), I just meant to be thorough in explaining the environment is copied, not shared by processes. Setenv() only affects the process from which it’s called.

The array size bit in the article: The value 0x220 looks suspiciously close to the size of the old environment in 64-bit words (0x220 / 8 = 68), and this value was written over the terminating NULL of the environment block…

HTH!

saagarjha1y ago

No, it does not. I don't think you understand what you are talking about, because none of these actually address the points I brought up. They use the same words, but semantically they are talking about something completely different.

1 more reply

swiftcoder1y ago

> I hope this helps!

It does not help, because you do not appear to have understood the article (or even read it all that closely).

Some of these bullet points feel a lot like the kind of junk output one sees from the various (popular, but flawed) AI summary tools...

vrtx01y ago

So, I’m real, and just trying to offer constructive feedback for a few errors I believe I noticed.

I could be wrong though. Could you be specific? I don’t want to misinform anyone…

1 more reply

colonial1y ago· 2 in thread

TIL that my set_var("RUST_LOG"...) calls at startup are technically unsafe. Funny.

I should see if the env_logger crate has a better solution.

loeg1y ago

At startup it's probably fine! It's safe in a single-threaded environment.

kurante1y ago

As long as they don't use `#[tokio::main]` or any other attribute that wraps main into an async function!

gavinhoward1y ago· 2 in thread

It is weird that I got this right before Rust did.

Because I use structured concurrency, I can make it so every thread has its own environment stack. To add to a new environment, I duplicate it, add the new variable, and push the new environment on the stack.

Then I can use code blocks to delimit where that stack should be popped. [1]

This is all perfectly safe, no `unsafe` required, and can even extend to other things like the current working directory. [2]

IMO, Rust got this wrong 10 years ago when Leakpocalypse broke. [3]

[1]: https://git.yzena.com/Yzena/Yc/src/branch/master/tests/yao/e...

[2]: https://gavinhoward.com/2024/09/rewriting-rust-a-response/#g...

[3]: https://gavinhoward.com/2024/05/what-rust-got-wrong-on-forma...

mmastrac1y ago

This isn't _really_ a Rust problem. Rust is a victim of POSIX.

If you have C FFI interop in Yao, there's still a chance you might have two C libraries cause a crash without your code even being involved.

gavinhoward1y ago

Except if there is dynamic linking, I can use that to inject my own setenv and getenv, just like people inject jemalloc or other malloc alternatives.

hauntsaninja1y ago· 1 in thread

We had so many of these issues that we ended up LD_PRELOAD-ing patched getenv / setenv / putenv

msully4321OP1y ago

With a fixed implementation that leaks environments (like the one that just landed in glibc)?

einpoklum1y ago· 1 in thread

A function which sets global process state is not thread safe? Why, I'm shocked; shocked and chagrined.

But really, I don't understand why a sensitive security-related library would implicitly use an unsafe function like setenv().

bangaladore1y ago

> A function which sets global process state is not thread safe? Why, I'm shocked; shocked and chagrined.

This is an oversimplification. Windows has essentially the exact same API and it works just fine in multithreaded contexts.

The issue here is unix allows the underlying pointer to be accessed, bypassing any possible thread-safe APIs.

lopkeny12ko1y ago· 1 in thread

The whole point of Rust is memory safety, not thread safety...

masklinn1y ago

Rust literally bakes data race safety into the language. While it does not resolve general race conditions, thread safety issues which cause memory unsafety (which a UAF or dangling pointer would be) are very much within its remit.

rikthevik1y ago

Great article about digging into a non-obvious bug. This one had it all! Intermittent bug, architecture-specific, hidden in a dependency, rust, the python GIL, gettext. Fantastic stuff.

These kinds of detailed troubleshooting reports are the closest thing you can get to having to do it yourself. Thanks to the authors. It's easy to say "don't use X duh" until a dependency relies on it, and how were you supposed to know?

datadeft1y ago

Couldn't we have a better pattern for this?

    if (__environ == NULL || name[0] == '\0')
      return NULL;

cuno1y ago

We ended up overriding and replacing with our own thread-safe version years ago when we also hit this.

janmatejka1y ago

This reminds me of the time I was not able to get setproctitle to work in a certain code base. Eventually I narrowed the issue to this line:

  import numpy

setproctitle() worked before numpy import but not after because it couldn't find the memory address of **environ.

I'm hazy on the details but it led me to a somethingenv call (possibly getenv or setenv) in numpy initialization and it turned out that function changed the address of **environ and that was the reason for why setproctitle couldn't find it.

loeg1y ago

env::set_var is marked unsafe now: https://doc.rust-lang.org/std/env/fn.set_var.html

And:

> This function is safe to call in a single-threaded program.

> This function is also always safe to call on Windows, in single-threaded and multi-threaded programs.

> In multi-threaded programs on other operating systems, the only safe option is to not use set_var or remove_var at all.

Meneth1y ago

From the backtrace, it seems strerror_r is not thread-safe, since it calls __dcigettext which calls getenv.

A similar bug related to setlocale was found in 2007 and fixed in 2014. That bug did not take getenv/setenv into account. https://sourceware.org/bugzilla/show_bug.cgi?id=5443

kazinator1y ago

This is not just a thread issue!

You run into a problem if you keep using a string returned by getenv after calling another environment function: including possibly getenv itself!

However, it's easy to just strdup the result of getenv; that defends against the issue in a single-threaded program.
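A minimal sketch of that defensive copy (the helper name `getenv_dup` is made up). As noted, this only covers the single-threaded case: the copy is taken right after getenv() returns, so later environment calls cannot invalidate it, but nothing here protects against another thread calling setenv() during the copy.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a heap-allocated copy of an environment variable, or NULL
 * if it is unset or allocation fails. The caller must free() it.
 * Only safe while no other thread is modifying the environment. */
char *getenv_dup(const char *name)
{
    assert(name != NULL);
    const char *value = getenv(name);
    return value != NULL ? strdup(value) : NULL;
}
```

The caller keeps using the copy and never holds on to the pointer getenv() returned.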

roca1y ago

Switching from OpenSSL to rustls solves even more problems than expected.

up2isomorphism1y ago

Does POSIX say setenv is thread safe? If not, why complain about it?

throwaway20371y ago

Click bait title? GLibC is very clear about what is and what is not thread-safe. I looked at the article: They fell victim to the classic getenv()/setenv() trap. This has been blogged about many times. If you look at the man page for setenv():

Ref: https://man7.org/linux/man-pages/man3/setenv.3.html

... it clearly says: "MT-Unsafe"

Also, there is a whole section about get/set env thread safety here (under "Other safety remarks -> env"):

https://man7.org/linux/man-pages/man7/attributes.7.html

wakawaka281y ago

Sounds like you just didn't know it's not threadsafe. This is common knowledge in the C and C++ world.

mmastrac1y ago· 33 in thread

usefulcat1y ago

But that won't actually fix the underlying problem, namely that getenv and setenv (or unsetenv, probably) cannot safely be called from different threads.

It seems like the only reliable way to fix this is to change these functions so that they exclusively acquire a mutex.

eqvinox1y ago

And, lastly, compatibility with ancient systems strikes again: the environment is also accessible through this:

   extern char **environ;

Which is, of course, best described as bullshit.

diroussel1y ago

Indeed, environment variables should be used to configure child processes, not to configure the current process, for non-shell programs, IMHO.

Note that Java, and the JVM, doesn't allow changing environment variables. It was the right choice, even if painful at times.

4 more replies

harrall1y ago

You can’t convince me that there is EVER a reason to call setenv() after program init as part of a regular program, outside needing to hack around something specific.

Environmental variables are not a replacement for your config. It’s not a place to store your variables.

2 more replies

lamontcg1y ago

5 more replies

stouset1y ago

> As far as I'm concerned, the environment is a read-only input parameter set on process creation like argv.

This holds for a lot of programs, but what if you're writing a shell?

3 more replies

kazinator1y ago

The underlying problem isn't just setenv, because the string returned by getenv can be invalidated by another call to getenv. ISO C says:

In a single threaded virtual machine, you can immediately duplicate the string returned by getenv and stop using it, right there.

Under threads, getenv is not required to be safe.

I think that with some care, it may be; an environment implementation could guarantee that a non-mutating operation like getenv doesn't invalidate any previously returned strings.

I think POSIX does that. It allows getenv to reallocate the environ array, but not the strings themselves:

environ is documented together with the exec family of functions; that's where this is found.

So whereas there are things not to like about environ, it can be the basis for thread safety of getenv in an application that doesn't mutate the environment.

Joker_vD1y ago

> As far as I'm concerned, the environment is a read-only input parameter set on process creation like argv.

Mutating argv is actually quite popular, or at least it used to be.

2 more replies

debugnik1y ago

So setenv's existence makes getenv inherently unsafe unless you can ensure the entire application is at a safe point to use them.

tsimionescu1y ago

This is actually not that hard to fix.

4 more replies

kazinator1y ago

josefx1y ago

1 more reply

xxs1y ago

They can have copy on write of course.

Ferret74461y ago

pshc1y ago

The underlying problem is that setenv is mutable global state and should never have existed

3 more replies

kazinator1y ago

The mutex would have to be held by the caller until it no longer needs the string returned from the environment, or makes a copy:

   stdenvlock();    // imaginary function added to ISO C or POSIX
   char *home = getenv("HOME");
   char *home_copy = strdup(home);
   stdenvunlock();  // only here can we unlock
   // home pointer is now indeterminate

Other solutions:

1. Put the above sequence into a function, and don't expose the mutex. Thread-safe code must use:

   char *home = dupenv("HOME"); // imaginary function; caller responsible for freeing.

2. Provide environment lookup into a buffer:

   getenvbuf("HOME", mybuf, sizeof mybuf);  // returns some value that helps to resize the buffer

All functions that retain pointers out of the classic getenv remain unsafe.

A mutex can be provided to those applications that want to manipulate the environ array directly, or use getenv and setenv, or any combinations of these.

The main problem is all the code out there using getenv.
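Option 2 could look roughly like the sketch below. Everything here is an assumption for illustration: `getenvbuf` is the imaginary function from the comment, the return convention (full length of the value, or -1 if unset) is one possible choice, and `env_lock` is an application-level mutex that only helps if every environment access in the process, including those made by libraries, goes through these wrappers.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Application-level lock; useless against code that calls
 * getenv()/setenv() directly or reads environ. */
static pthread_mutex_t env_lock = PTHREAD_MUTEX_INITIALIZER;

/* Copy the value of name into buf (at most size bytes, always
 * NUL-terminated when size > 0). Returns the full length of the
 * value, so a result >= size means the copy was truncated, or -1
 * if the variable is unset. */
long getenvbuf(const char *name, char *buf, size_t size)
{
    assert(name != NULL && (buf != NULL || size == 0));
    long result = -1;

    pthread_mutex_lock(&env_lock);
    const char *value = getenv(name);
    if (value != NULL) {
        size_t len = strlen(value);
        if (size > 0) {
            size_t n = len < size - 1 ? len : size - 1;
            memcpy(buf, value, n);
            buf[n] = '\0';
        }
        result = (long)len;
    }
    pthread_mutex_unlock(&env_lock);
    return result;
}

/* setenv() under the same lock, always overwriting. */
int setenv_locked(const char *name, const char *value)
{
    pthread_mutex_lock(&env_lock);
    int rc = setenv(name, value, 1);
    pthread_mutex_unlock(&env_lock);
    return rc;
}
```

The pointer returned by getenv() never escapes the locked region, which is what makes the copy safe among callers that cooperate.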

liontwist1y ago

Please no.

If your program wants to use the environment as an out-of-band global var for cross thread communication, you can make your own mutex.

1 more reply

ModernMech1y ago

eqvinox1y ago

This is neither an OS nor a machine scope problem. The environment is provided by the OS at startup. What the process does with it from there on is its own concern.

1 more reply

belter1y ago

> It seems like the only reliable way to fix this is to change these functions so that they exclusively acquire a mutex.

A mutex can ensure thread safety but risks deadlocks if not used carefully and will hurt performance...

hamandcheese1y ago

Agree about performance, but wouldn't there need to be >1 mutex to risk a deadlock?

2 more replies

fch421y ago

goeiedaggoeie1y ago

setenv and getenv have never been thread safe, why the concern with it now?

1 more reply

loeg1y ago

Is that the underlying problem, or is the underlying problem that libraries are using thread-unsafe setenv in threaded contexts when they could just do something else?

db48x1y ago

But it would force Rust programs to add their own synchronization mechanism around them. As long as no two threads can call getenv/setenv at the same time then it’s fine.

kibwen1y ago

The problem isn't something that Rust can solve.

thayne1y ago

In particular, it doesn't help if you call a c function that indirectly modifies the environment with FFI.

zamalek1y ago

> Nowadays the best solution to this issue is "stop using this crate" with libraries like rustls.

Nice to see that the author of the library has a sensible take. Unfortunately the ecosystem does not: https://github.com/seanmonstar/reqwest/blob/master/Cargo.tom...

benatkin1y ago

People get trained to ignore the ____UNSAFE_payattention__nevermindthatthisappears50timesinthisfile___ blocks and prefixes

This also shows up in web frameworks where Vue has the v-html directive and React has dangerouslySetInnerHTML. Vue definitely has it better.

crooked-v1y ago

javier21y ago

rerdavies1y ago

Code syntax highlighting libraries for React use dangerouslySetInnerHTML.

benatkin1y ago

forrestthewoods1y ago· 20 in thread

Mutable global state is evil. Friends don’t let friends use mutable global state.

I hate envvars. It’s “the Linux way”. I avoid them like the plague. A++ strong recommend.

libc is terrible. The world needs to move on.

01HNNWZ0MV43FF1y ago

Env vars are good if you treat them as read-only within the process

forrestthewoods1y ago

I’ll take a config file over an envvar 100% of the time.

msully4321OP1y ago

Yeah, setenv should probably just not exist, and environment variables should be only set when spawning new processes.
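
A rough sketch of that approach in C: build a fresh envp for the child and hand it to posix_spawn, leaving the parent's environment untouched. The helper names child_env_with and spawn_with_env are made up for illustration, and the sketch assumes no other thread is modifying the environment while the array is being built.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <spawn.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

extern char **environ;

/* Builds a new NULL-terminated array: the parent's entries (pointers are
   shared, not copied) minus any entry with the same name, plus the extra
   "NAME=value" entry. The caller frees the array itself, not the strings. */
char **child_env_with(const char *extra_entry)
{
    assert(extra_entry != NULL);
    const char *eq = strchr(extra_entry, '=');
    assert(eq != NULL);
    size_t prefix = (size_t)(eq - extra_entry) + 1;  /* length of "NAME=" */

    size_t count = 0;
    while (environ != NULL && environ[count] != NULL)
        count++;

    char **envp = malloc((count + 2) * sizeof *envp);
    if (envp == NULL)
        return NULL;

    size_t out = 0;
    for (size_t i = 0; i < count; i++) {
        if (strncmp(environ[i], extra_entry, prefix) != 0)
            envp[out++] = environ[i];
    }
    envp[out++] = (char *)extra_entry;
    envp[out] = NULL;
    return envp;
}

/* Starts path with argv and the extended environment. Returns 0 or an
   error number, like posix_spawn itself. */
int spawn_with_env(const char *path, char *const argv[],
                   const char *extra_entry, pid_t *pid_out)
{
    char **envp = child_env_with(extra_entry);
    if (envp == NULL)
        return ENOMEM;
    int rc = posix_spawn(pid_out, path, NULL, NULL, argv, envp);
    free(envp);
    return rc;
}
```

The same array could be passed to execve after a fork. Either way the parent never calls setenv, so nothing in its own process can race on the change.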

maep1y ago

> Mutable global state is evil. Friends don’t let friends use mutable global state.

Throw away your CPU and RAM then.

snowfarthing1y ago

incrudible1y ago

Your CPU has an MMU in order to (among other things) let the OS prevent mutable global state.

forrestthewoods1y ago

I can not possibly roll my eyes hard enough.

titzer1y ago

And disks. And the cloud. Or basically, you know, computers.

kibwen1y ago

Don't threaten me with a good time.

incrudible1y ago

Ah yes, the cloud where we all happily share compute resources without any restrictions to avoid stomping on each other's toes.

layer81y ago

The universe, you mean.

sim7c001y ago

what do you suggest as alternative?

the problem is not linux, not mutable global state or resources and not libc.

there is too much money in halfbaked code. sad but true.

viraptor1y ago

sim7c001y ago

just because someone makes a window doesn't mean you gotta jump out of it. there are good and bad uses for things, and the bad ones should be avoided lest one hurt themselves?

glouwbug1y ago

libc moved the world into the Information Age

kibwen1y ago

In the same way that Yersinia pestis moved the world into the Renaissance?

glouwbug1y ago

Yes, neither were memory or thread safe

jimbob451y ago

What's your preferred alternative?

andrewmcwatters1y ago

Don’t use a mouse or a monitor then.

snowfarthing1y ago

shikon71y ago· 18 in thread

I wonder why it is so hard for Rust to implement its own safe stdlib independent of C.

dgrunwald1y ago

How exactly would that help in this situation?

rcxdude1y ago

If there was a libc implemented in rust (like https://github.com/redox-os/relibc), you could use that for the C code in the process, and you'd be sharing the relevant state.

do_not_redeem1y ago

steveklabnik1y ago

Linux is an unusual platform in that it allows you to call into it via assembly. Most other platforms require you to go through libc to do so. It's not really in Rust's hands.

PaulDavisThe1st1y ago

This is not unusual at all. Windows allowed it for years before Linux came along. It was also true of some other *nix systems - IIRC, Ultrix (DEC) allowed this, and so did Dynix (Sequent).

*BSD allows it too, or used to, as of 2022.

What is unusual about Linux is that it guarantees a syscall ABI, meaning that if you follow it, you can make a system call "portably" across "any" version of Linux.

kbolino1y ago

They did, it's called core. But it assumes no operating system at all, and environment variables require an operating system.

nomel1y ago

> and environment variables require an operating system

kbolino1y ago

panzi1y ago

sunshowers1y ago

Environment variables are not just technical, they're social. You need to get everyone on board with your scheme.

xxs1y ago

>and environment variables require an operating system.

Yes, to read them. If Rust wishes to modify them, it should modify its own, already copied structure. I'd do that in pretty much any language.

zanderwohl1y ago

IshKebab1y ago

"Impossibru!!"

https://github.com/sunfishcode/eyra

Oh look:

> Why use Eyra? It fixes Rust's set_var unsoundness issue. The environment-variable implementation leaks memory internally (it is optional, but enabled by default), so setenv etc. are thread-safe.

zanderwohl1y ago

> Why not use Eyra?

Well, that's a lot of caveats. As I said, it would take years to complete. And it looks like it's well on its way but not near complete.

dvtkrlbs1y ago

If I understand it correctly this still doesn't help with downstream dependencies.

kbolino1y ago

That's quite a trade-off

sunshowers1y ago

That only works on Linux though right?

v3xro1y ago

There are rust libc implementations e.g. one by Redox: https://gitlab.redox-os.org/redox-os/relibc

ChrisSD1y ago· 13 in thread

https://doc.rust-lang.org/stable/std/env/fn.set_var.html

jrmg1y ago

Wow, glibc now

There have got to be pathological uses out there where this will cause unbounded memory growth in well-formed (according to the API) programs, no?

GoblinSlayer1y ago

FWIW you can make a singly linked list with infinite number of nodes too. Memory leaks happen in well formed programs just fine, glibc is just one of many examples.

Thaxll1y ago

Why require unsafe when the std implementation could take care of the synchronisation?

masklinn1y ago

rerdavies1y ago

But the standard implementation could copy the environment at startup, and only use its copy.

And the library's use of setenv is clearly a bug as setenv is documented to be not threadsafe in the C standard library. So that would take care of that problem.
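
A sketch of what "copy at startup, only use the copy" could look like in C. The names env_snapshot_init and env_snapshot_get are hypothetical, and the sketch assumes the snapshot is taken before any other thread exists. It does nothing for C libraries in the same process that keep calling getenv and setenv directly.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

extern char **environ;

/* Private deep copy of the environment, NULL-terminated. */
static char **snapshot;

/* Call once, early in main, before starting threads. Returns 0 or -1. */
int env_snapshot_init(void)
{
    size_t count = 0;
    while (environ != NULL && environ[count] != NULL)
        count++;

    char **copy = calloc(count + 1, sizeof *copy);
    if (copy == NULL)
        return -1;

    for (size_t i = 0; i < count; i++) {
        copy[i] = strdup(environ[i]);
        if (copy[i] == NULL) {
            while (i > 0)
                free(copy[--i]);
            free(copy);
            return -1;
        }
    }
    snapshot = copy;
    return 0;
}

/* Looks the name up in the private copy only; never touches environ. */
const char *env_snapshot_get(const char *name)
{
    assert(name != NULL);
    if (snapshot == NULL)
        return NULL;
    size_t len = strlen(name);
    for (char **entry = snapshot; *entry != NULL; entry++) {
        if (strncmp(*entry, name, len) == 0 && (*entry)[len] == '=')
            return *entry + len + 1;
    }
    return NULL;
}
```

Because the snapshot is never written after initialization, the returned pointers stay valid for the life of the process and can be read from any thread.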

miohtama1y ago

Is it possible to skip libc completely or would this introduce too many portability concerns?

ChrisSD1y ago

The only way to coordinate locking would be to do so in libc itself.

wahern1y ago

demurgos1y ago

It can't ensure synchronization because any code using libc could bypass the sync wrapper. In particular, Rust lets you link C libs which wouldn't use the Rust stdlib.

msully4321OP1y ago

Because it can still race with C code using the standard library. getenv calls are common in C libraries; the call to getenv in this post was inside of strerror.

fsckboy1y ago

you've gotten a lot of answers which say the same thing, but which I don't think answer your question:

synchronization methods impose various complexity and performance penalties, and single threaded applications which don't need that would pay those penalties and get no benefit.

davidt841y ago

The real problem is that getenv() and setenv() were created before threads were really a thing.

sunshowers1y ago

vlovich1231y ago· 8 in thread

I'm also not convinced by Musl's maintainer that it can't be fixed within Musl considering glibc is making changes to make this a non-issue.

usefulcat1y ago

The biggest problem is not the absence of a thread safe API, it's the existence of this:

    extern char **environ;

As long as environ is publicly accessible, there's no guarantee that setenv and getenv will be used at all, since they're not necessary.
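
To make that concrete, here is a small sketch of code that changes a variable by writing to environ directly. The function name overwrite_via_environ is made up for illustration. It never calls setenv or putenv, so a mutex added inside those functions would not be taken.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

extern char **environ;

/* Replaces the slot whose "NAME=" prefix matches entry. Returns 1 if a slot
   was replaced, 0 if the variable was not present. The entry string must
   stay alive for as long as the environment refers to it. */
int overwrite_via_environ(char *entry)
{
    assert(entry != NULL);
    const char *eq = strchr(entry, '=');
    assert(eq != NULL);
    size_t prefix = (size_t)(eq - entry) + 1;  /* length of "NAME=" */

    for (char **slot = environ; slot != NULL && *slot != NULL; slot++) {
        if (strncmp(*slot, entry, prefix) == 0) {
            *slot = entry;  /* plain pointer store, no libc call involved */
            return 1;
        }
    }
    return 0;
}
```

A later getenv sees the new value, and a concurrent getenv in another thread would be racing with the pointer store regardless of what locking libc does internally.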

vlovich1231y ago

> aka don't let the perfect be the enemy of the good

Exactly my point. Over time *environ would disappear, at least from the major software projects that everyone uses (assuming it's even in use in them in the first place).

aragilar1y ago

IshKebab1y ago

panzi1y ago

Guess that would also require some locking for all the exec() functions that don't take the environment as a parameter or that search PATH for the executable.

davidt841y ago

I'm not convinced by you that you know more than the experts who have determined there is no backwards-compatible way to fix this.

vlovich1231y ago

I'll take existence proofs [1] over personal insults but YMMV. You also may want to be careful assuming the expertise of people on this forum. Some people here are quite technical.

[1] https://github.com/bminor/glibc/commit/7a61e7f557a97ab597d6f...

davidt841y ago

That isn't thread safe, it's safER.

I am also quite technical, thanks.

jandrese1y ago· 7 in thread

umpalumpaaa1y ago

The man page says:

> POSIX.1 does not require setenv() or unsetenv() to be reentrant.

A non-reentrant function cannot be thread safe.

In general, for POSIX, libc, and many other libraries: if the docs do not explicitly say "this function is thread safe", it is not.

wmf1y ago

It's time to move beyond this attitude and make things safe by default. For example, Solaris has a safer version of setenv().

jabl1y ago

> A non-reentrant function cannot be thread safe.

Actually, a non-reentrant function can be thread-safe. A common example of such a function in libc being malloc().

adrian_b1y ago

By definition, a "reentrant function" is a function that may be invoked even when it has not returned yet from a previous invocation.

So a non-reentrant function is a function that may not be invoked again between a previous invocation and returning from that invocation.

When a function may be invoked from different threads, then it is certain that sometimes it will be invoked by a thread before returning from a previous invocation from a different thread.

An implementation of "malloc" may be reentrant or it may be non-reentrant.

Old "malloc" implementations were usually non-reentrant because they used global variables for managing the heap. Such "malloc" functions could not be used in multi-threaded programs.
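
A toy illustration of the global-state point, not taken from any malloc: a function that keeps its result in a static buffer next to a variant where the caller owns the storage. Both function names are invented for this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Not reentrant: every call writes the same static buffer, so a second
   call (from a signal handler, another thread, or just later code)
   overwrites what the first caller is still looking at. */
const char *label_static(int id)
{
    static char buf[32];
    snprintf(buf, sizeof buf, "id-%d", id);
    return buf;
}

/* Reentrant variant: all state lives in storage supplied by the caller. */
const char *label_r(int id, char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    snprintf(buf, size, "id-%d", id);
    return buf;
}
```

getenv has the same shape as label_static: it hands back a pointer into shared storage that a later call is allowed to change.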

sumtechguy1y ago

I could be wrong but isn't that because each thread has its own heap?

01HNNWZ0MV43FF1y ago

Funny enough, the Rust wrapper `std::env::set_var` does have a big warning https://doc.rust-lang.org/std/env/fn.set_var.html

subarctic1y ago

Looks like that Safety section was added in 1.76.0. It'll be an even bigger warning in the future since it's now going to be unsafe in Rust 2024

StillBored1y ago· 6 in thread

kelnos1y ago

> environment related bug on linux, which is mysteriously less a problem on other unix's.

depr1y ago

>> environment related bug on linux, which is mysteriously less a problem on other unix's.

> How do you figure?

From https://illumos.org/man/3C/putenv:

> The putenv() function can be safely called from multithreaded programs

anonfordays1y ago

rerdavies1y ago

Why does adding a mutex break the API? I guess it breaks `char**environ`. But the API wouldn't be broken.

benmmurphy1y ago

einpoklum1y ago

> Even getenv_r() can't really save you, since another thread may be calling setenv() while the (old) value of an env var is being copied into the provided buffer.

Won't that depend on the libc implementation? For example, maybe setenv writes to another buffer, then swaps pointers atomically; wouldn't that work?
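
A sketch of that swap idea for a single value, using C11 atomics. The names publish_value and read_value are hypothetical and this is not how any particular libc implements setenv. The catch is visible in the code: the old buffer can never be freed, because a reader may still hold a pointer to it, which is the leak trade-off mentioned elsewhere in this thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* The currently published value; NULL until the first publish. */
static _Atomic(char *) current_value;

/* Writer: fill a fresh buffer first, then swap the pointer in one step.
   Returns 0 on success, -1 if allocation fails. */
int publish_value(const char *value)
{
    assert(value != NULL);
    char *fresh = strdup(value);
    if (fresh == NULL)
        return -1;
    /* The previous buffer is deliberately leaked: readers may still be
       using it, and there is no way to know when they are done. */
    (void)atomic_exchange(&current_value, fresh);
    return 0;
}

/* Reader: the returned pointer stays valid forever, but may be stale. */
const char *read_value(void)
{
    return atomic_load(&current_value);
}
```

A reader never sees a half-written string, only the old value or the new one. Memory use grows with every publish, so this fits values that change rarely.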

masklinn1y ago· 6 in thread

Animats1y ago

Yes. That's known.

Why is anyone using "setenv" anyway?

mmastrac1y ago

Oversights all around and hopefully this continues to improve.

mark_undoio1y ago

It's worth noting here that you can also build your binaries and keep debug symbols separately.

You don't need to ship them with the binary (although it will make many scenarios a bit simpler if you do, since you'll always have the right ones available).

Some info that might help: https://www.tweag.io/blog/2023-11-23-debug-fission/ https://undo.io/resources/gdb-watchpoint/reduce-binary-size-...

nemetroid1y ago

> we're going to ship symbols after this, no reason to over-optimize

You might want to look into debuginfod.

masklinn1y ago

> Why is anyone using "setenv" anyway?

Because it’s there and it looks like a good idea until it takes one of your fingers.

einpoklum1y ago

It really does not look like a good idea to setenv() . The very notion is quite terrifying. Messing with a bunch of globals, that other code knows about as well? Nuh-uh.

nwellnhof1y ago· 6 in thread

> Our nightly CI machines run on Amazon AWS, which has the advantage of giving us a real, uncontainerized root user.

> We don’t have the necessary files outside of the container, and our containers are quite minimal and don’t allow us to easily install gdb.

Have people lost the ability to build and debug their code locally, without clouds and containers?

api1y ago

Tangent I know.

bluGill1y ago

There is a lot to like about the cloud model as a user. I can access my data wherever I am, from whatever device I have, and I won't lose it to a disk crash.

there are faults to the cloud but it solves real problems users have.

api1y ago

There are other ways that could be achieved, like cloud storage constantly mirroring local but encrypted with local keys or keys controlled by the user.

This is the iCloud model and it works. Imagine a more open version with competing storage providers.

This, however, would hand control back to the user, which would be bad for the software industry with its addiction to lock in and recurring revenue.

bluGill1y ago

This is a random crash only on ARM. I doubt they could get the crash to happen locally - most likely their developer machines were all x86 where it never crashed.

they should have handled crashes better - a problem they seem to recognize but not the issue here so not covered.

msully4321OP1y ago

> Have people lost the ability to build and debug their code locally, without clouds and containers?

No, of course not, but it didn't crash on our machines!

mardifoufs1y ago

kelnos1y ago· 5 in thread

This reminded me of that whole "12-factor app" movement, which several of my former coworkers had really bought into. One of the "factors" is that apps should be configured by environment variables.

shortrounddev21y ago

__MatrixMan__1y ago

eqvinox1y ago

getenv() is perfectly fine, it's setenv() that is the problem. Which in theory this wouldn't be using since the env would be set up prior to starting that mystical app.

jillesvangurp1y ago

johnny221y ago

This is unrelated really. If you read your environment variables into config and never touched them again, then you're totally safe.

I personally use 12 factor app style, but once it's entered the app I validate the env variables and data and then store them. It's totally fine after that.
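
Roughly what that looks like in C, as a sketch. The variable names APP_PORT and APP_LOG_LEVEL and the struct app_config are invented for the example. The idea is to read, validate, and copy everything once at startup, then never call getenv again.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical application settings, owned by the program after loading. */
struct app_config {
    long port;
    char log_level[16];
};

/* Reads and validates the environment once. Returns 0 on success, or -1 if
   a value is missing or invalid, leaving *cfg untouched in that case. */
int load_config(struct app_config *cfg)
{
    assert(cfg != NULL);

    /* Parse the port before the next getenv call, since a later call is
       allowed to invalidate the pointer returned by an earlier one. */
    const char *port_text = getenv("APP_PORT");
    if (port_text == NULL || *port_text == '\0')
        return -1;
    errno = 0;
    char *end = NULL;
    long port = strtol(port_text, &end, 10);
    if (errno != 0 || *end != '\0' || port < 1 || port > 65535)
        return -1;

    const char *level = getenv("APP_LOG_LEVEL");
    if (level == NULL)
        level = "info";  /* assumed default for this sketch */
    if (strlen(level) >= sizeof cfg->log_level)
        return -1;

    cfg->port = port;
    strcpy(cfg->log_level, level);  /* length checked above */
    return 0;
}
```

After load_config returns, the rest of the program reads the struct, so later changes to the environment cannot affect it.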

HarHarVeryFunny1y ago· 5 in thread

What is the rationale for libc not making setenv/getenv thread safe? It does seem rather odd given how environment variables are explicitly defined as shared between threads in the same process!

jeroenhd1y ago

The spec says it's not supposed to be thread safe.

4gotunameagain1y ago

I think the argument was that the standard states that setenv is not thread safe, although from what I see it says that it does not have to be thread safe:

  The setenv( ) function need not be thread-safe. A function that is not required to be thread-safe is not required to be reentrant.

https://www.open-std.org/jtc1/sc22/open/n4217.pdf.

Page.. 1860 :')

HarHarVeryFunny1y ago

Sure, but given that Linux defines the environment as state that's shared between threads, not having a thread-safe way of accessing it is hard to defend...

debugnik1y ago

Libc could of course get better APIs, like GetEnvironmentVariable on Windows, but that won't fix all existing code.

saagarjha1y ago

The rationale is that it was implemented before threads existed, and now can't be retrofitted with thread safety.

vrtx01y ago· 5 in thread

Let me try to help:

1. If a process crashes and dumps, be sure to look at the system log of the cause (e.g. SIGSEGV, OOM, invalid instruction, etc.)

2. Be certain you’re looking at the right core dumps — I believe UID 1000 just means posix UserID (which is unrelated to a PID), though I don’t use containers.

3. Stay focused on the right level of abstraction — memory model details are great to know, but irrelevant here.

4. Variables do not correlate 1:1 with registers, except in C calling conventions. The assumption about x20 and a local variable is incorrect, unfortunately.

6. FWIW, ‘char** env’ is a null-terminated array of pointers, so dumping memory from *env (or env[0]) is only valid until you hit the first NULL. The size of the array is not stored in the array.

saagarjha1y ago

vrtx01y ago

Re: fork(), I just meant to be thorough in explaining the environment is copied, not shared by processes. Setenv() only affects the process from which it’s called.

HTH!

saagarjha1y ago

swiftcoder1y ago

> I hope this helps!

It does not help, because you do not appear to have understood the article (or even read it all that closely).

Some of these bullet points feel a lot like the kind of junk output one sees from the various (popular, but flawed) AI summary tools...

vrtx01y ago

So, I’m real, and just trying to offer constructive feedback for a few errors I believe I noticed.

I could be wrong though. Could you be specific? I don’t want to misinform anyone…

colonial1y ago· 2 in thread

TIL that my set_var("RUST_LOG"...) calls at startup are technically unsafe. Funny.

I should see if the env_logger crate has a better solution.

loeg1y ago

At startup it's probably fine! It's safe in a single-threaded environment.

kurante1y ago

As long as they don't use `#[tokio::main]` or any other attribute that wraps main into an async function!

gavinhoward1y ago· 2 in thread

It is weird that I got this right before Rust did.

Then I can use code blocks to delimit where that stack should be popped. [1]

This is all perfectly safe, no `unsafe` required, and can even extend to other things like the current working directory. [2]

IMO, Rust got this wrong 10 years ago when Leakpocalypse broke. [3]

[1]: https://git.yzena.com/Yzena/Yc/src/branch/master/tests/yao/e...

[2]: https://gavinhoward.com/2024/09/rewriting-rust-a-response/#g...

[3]: https://gavinhoward.com/2024/05/what-rust-got-wrong-on-forma...

mmastrac1y ago

This isn't _really_ a Rust problem. Rust is a victim of POSIX.

If you have 1) C FFI interop in Yao, there's still a chance you might have two C libraries cause a crash without your code even being involved.

gavinhoward1y ago

Except if there is dynamic linking, I can use that to inject my own setenv and getenv, just like people inject jemalloc or other malloc alternatives.

hauntsaninja1y ago· 1 in thread

We had so many of these issues that we ended up LD_PRELOAD-ing patched getenv / setenv / putenv

msully4321OP1y ago

With a fixed implementation that leaks environments (like the one that just landed in glibc)?

einpoklum1y ago· 1 in thread

A function which sets global process state is not thread safe? Why, I'm shocked; shocked and chagrined.

But really, I don't understand why a sensitive security-related library would implicitly use an unsafe function like setenv().

bangaladore1y ago

> A function which sets global process state is not thread safe? Why, I'm shocked; shocked and chagrined.

This is an oversimplification. Windows has essentially the exact same API and it works just fine in multithreaded contexts.

The issue here is unix allows the underlying pointer to be accessed, bypassing any possible thread-safe APIs.

lopkeny12ko1y ago· 1 in thread

The whole point of Rust is memory safety, not thread safety...

masklinn1y ago

rikthevik1y ago

Great article about digging into a non-obvious bug. This one had it all! Intermittent bug, architecture-specific, hidden in a dependency, rust, the python GIL, gettext. Fantastic stuff.

datadeft1y ago

Couldn't we have a better pattern for this?

    if (__environ == NULL || name[0] == '\0')
      return NULL;

cuno1y ago

We ended up overriding and replacing with our own thread-safe version years ago when we also hit this.

janmatejka1y ago

This reminds me of the time I was not able to get setproctitle to work in a certain code base. Eventually I narrowed the issue to this line:

  import numpy

setproctitle() worked before numpy import but not after because it couldn't find the memory address of **environ.

loeg1y ago

env::set_var is marked unsafe now: https://doc.rust-lang.org/std/env/fn.set_var.html

And:

> This function is safe to call in a single-threaded program.

> This function is also always safe to call on Windows, in single-threaded and multi-threaded programs.

> In multi-threaded programs on other operating systems, the only safe option is to not use set_var or remove_var at all.

Meneth1y ago

From the backtrace, it seems strerror_r is not thread-safe, since it calls __dcigettext which calls getenv.

A similar bug related to setlocale was found in 2007 and fixed in 2014. That bug did not take getenv/setenv into account. https://sourceware.org/bugzilla/show_bug.cgi?id=5443

kazinator1y ago

This is not just a thread issue!

You run into a problem if you keep using a string returned by getenv after calling another environment function: including possibly getenv itself!

However, it's easy to just strdup the result of getenv; that defends against the issue in a single-threaded program.

roca1y ago

Switching from OpenSSL to rustls solves even more problems than expected.

up2isomorphism1y ago

Does POSIX say setenv is thread safe? If not, why complain about it?

throwaway20371y ago

Ref: https://man7.org/linux/man-pages/man3/setenv.3.html

... it clearly says: "MT-Unsafe"

Also, there is a whole section about get/set env thread safety here (under "Other safety remarks -> env"):

https://man7.org/linux/man-pages/man7/attributes.7.html

wakawaka281y ago

Sounds like you just didn't know it's not threadsafe. This is common knowledge in the C and C++ world.
