Using a macro is more succinct:
    const char *s = gc(xasprintf("%s/%s", dir, name));
than what's being proposed:
    char *s = xasprintf("%s/%s", dir, name);
    defer free(s);
See this x86 reference implementation of defer() and gc(): https://gist.github.com/jart/aed0fd7a7fa68385d19e76a63db687f... That should just work with GCC and Clang. The code is originally from the Cosmopolitan C Library (https://github.com/jart/cosmopolitan), so check it out if you like the Gist and want more!
Please note the macro operates at function call boundaries rather than at block scope. I consider that a feature, since it behaves sort of like a memory pool. Having side effects at the block scope level requires changing compilers and the language itself, and it would cause several important GCC optimization passes to be disabled in places where it's used.
- Inlining: it will work but the defer will be executed at the end of the caller function, which may not be what you expected.
- Tail call optimization: same issue. (You mention in a comment that you use a dummy asm statement to prevent the call to `__defer` itself from being tail-call optimized, but there's still an issue if one defer-using function tail-calls another defer-using function.)
- Function outlining aka hot/cold splitting (currently implemented in LLVM): arbitrary chunks of a function can be split out into their own functions; if one of those does a defer, the cleanup might be run too early, considerably more dangerous than too late.
- Various CFI (control-flow integrity) implementations that are specifically designed to prevent the return address from being overwritten (by exploits).
- Interprocedural register allocation (-fipa-ra in GCC) if __defer gets inlined or analyzed via link-time optimization: The compiler can make assumptions that functions won't modify certain registers that the ABI would normally allow them to modify; this will be violated if it unexpectedly jumps to __defer. This is fixable by marking __defer as __attribute__((noipa)) or reimplementing in assembly.
- Targeting WebAssembly or BPF or other high-level machines that don't support overwriting return addresses.
- Compilers that don't support inline assembly (MSVC).
EDIT: - Targeting ARM if the compiler happens to stash the return address in an unexpected location. You can't fix this by writing to LR like you suggested; the return address needs to be in LR when you execute the ret instruction, but the compiler doesn't need to keep it there for the whole function, and usually won't. Instead, it will usually save it to the stack frame, and load it back before returning, potentially using LR for completely unrelated purposes in between. So usually you can modify the return address by using __builtin_frame_address just like on x86. But that's an implementation detail; it could decide to keep a copy in another register, and move that to LR when returning. Not sure if any compilers actually do that, though I think I might have seen something like that on PowerPC.
Your approach is also relatively slow, since the cleanup code can't be inlined.
(If you're going to use a GNU extension for inline asm, why not just use the GNU extension __attribute__((cleanup))? It's block-scoped, but it doesn't disable any optimization passes or anything since the compiler knows about it, it's portable, and it doesn't have the problems I mentioned.)
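For readers who haven't used it, here is a minimal sketch of the __attribute__((cleanup)) approach mentioned above, assuming GCC or Clang. The names free_charp and joined_length are made up for illustration, and plain malloc/snprintf stand in for xasprintf. Note that the handler receives a pointer to the annotated variable, which is why free() can't be passed directly.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Cleanup handlers are called with a pointer to the annotated variable,
   so free() needs a small wrapper (hypothetical name). */
static void free_charp(char **p) {
    free(*p);
}

/* Hypothetical helper: builds "dir/name" in a heap buffer and returns the
   length of the result, or -1 if the allocation fails. */
static int joined_length(const char *dir, const char *name) {
    size_t n = strlen(dir) + strlen(name) + 2; /* room for '/' and '\0' */
    __attribute__((cleanup(free_charp))) char *s = malloc(n);
    if (s == NULL)
        return -1;              /* handler still runs; free(NULL) is a no-op */
    snprintf(s, n, "%s/%s", dir, name);
    return (int)strlen(s);      /* s is freed as it goes out of scope */
}
```

The cleanup runs on every exit from the block, including the early return, and the compiler is free to inline it.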
Inlining, tailcall, hot/cold: None of these are issues. They don't change the fact that memory passed to gc() will be freed. Worst case scenario is the can gets kicked down the road, which is relatively easy to predict. See https://gist.github.com/jart/5aba7fc72c7b6781dadd5949c289a0b... So long as you're not using this technique to unlock mutexes, you'll be fine.
Developers who are required to use CFI need to reach out to their policymakers for authorization to modify return addresses before using the gc() macro. Folks required to use MSVC can use the existence of the gc() macro as compelling evidence for their bosses on the benefits of switching to GCC or Clang.
I'll take your word on IPA. I've added a comment to the Gist making sure folks who use it are aware. Thanks for the awesome info on ARM. That's good to know. Also, I believe the code is fast.
I like the gc() macro because it can be used in expressions. I find __attribute__((__cleanup__)) unpleasant since it has strong opinions about how variables and cleanup functions need to be declared.
This does not sound good.
In addition, it is worth mentioning that your hack will probably break the return stack buffer of the x86 superscalar processors.
A native "defer" implementation would have neither of these two flaws.
No doubt it's a pretty neat hack, but I would be sceptical about using it extensively in place of a "defer".
That's true for inlining and tailcall, but not hot/cold, since it can make cleanups execute earlier than expected rather than later. This happens if: (1) some chunk of the function is extracted into a separate function; (2) the chunk contains a call to defer; and (3) some code which is not in the extracted chunk, but executes after it, expects the cleanup to not have run yet. See this example:
https://gcc.godbolt.org/z/1ro1z6
To be fair, Clang does not enable this optimization by default, and it might get replaced by a different implementation of hot-cold splitting [1] which happens to not suffer from this issue (because it operates later in the pipeline, essentially splitting blocks at the assembly level rather than the IR level).
On the other hand, GCC does enable a form of hot-cold splitting by default, via -fpartial-inlining, but it's limited to splitting out suffixes of the function (i.e. regions starting somewhere in the function and including everything in the function that can be executed from then on), rather than arbitrary regions. Therefore, it can't run into the problematic case where non-extracted code runs after extracted code. Still, this is just an implementation limitation that could be lifted in the future.
> I like the gc() macro because it can be used in expressions. I find __attribute__((__cleanup__)) unpleasant since it has strong opinions about how variables and cleanup functions need to be declared.
Fair enough; I do agree on that point. (I wish GCC had a way to either use attribute cleanup with C99 compound literals, or somehow declare variables that live within statement expressions as having their lifetime extended to a surrounding block… Maybe what I really want is a better macro system. Or the native `defer` feature proposed here, but I doubt that will ever happen.)
[1] https://lists.llvm.org/pipermail/llvm-dev/2020-August/144012...
Also, there is no inline assembly support in standard C, just in various compilers.
I think it was meant to be a portable language, not to let you write portable code. With the size of standard types being machine dependent you couldn't write completely portable code, but you could write C on a lot of hardware.
It's like how 8 bit computers all had BASIC but weren't compatible. If you knew one it was easy to get going on another because at some level it was all BASIC.
Though I still do it all manually, and am looking for ways to automate it.
Disclaimer: I haven't looked at the code too closely.
In section 1.1, the linearization it gives with goto statements is barely longer than the defer example. They claim defer is better just because of the proximity of the cleanup code? Why not just move the "resources acquired" code to a separate function? You wouldn't even need goto in that case, you could just nest if statements to do the cleanup.
The spec claims defer allocates memory. Why? As far as I know __attribute__((cleanup(fn))) doesn't allocate memory. This defer may exhaust memory, and if so, it will immediately terminate execution of the enclosing guard block with a panic() and DEFER_ENOMEM. So like an exception?
This says exit() or panic() will clean up all guarded blocks across all function calls of the same thread. So basically stack unwinding? Apparently you can recover somewhere with a call to recover()? This is just exceptions by another name. This stack unwinding can't possibly interoperate with existing code that expects error return values.
This claims it's robust because any deferred statement is guaranteed to be executed eventually, and it describes in great detail how it runs defer statements on signals. What if I write an infinite loop, or get a SIGKILL, or yank the power cord? Obviously deferred statements won't be executed.
This says defer is implemented with longjmp. Isn't setjmp/longjmp way too slow for exception handling? C++ compilers haven't done exceptions that way for decades. What happens if I longjmp or goto past a defer statement? This says it just doesn't invoke the defer mechanism and may result in memory leaks or other damage. Does that mean it's undefined behaviour? C++ won't compile a goto past constructors for good reason.
All POSIX error and signal codes have an equivalent prefixed with DEFER_, e.g. DEFER_ENOMEM, DEFER_HUP. This is just in case the system doesn't already have ENOMEM? Doesn't the standard already require that ENOMEM exist? If not, why not just make this feature require that ENOMEM exist? Why depend so much on errno for new core language features when it's basically an ugly artifact of ancient C library functions?
> If C will be extended with lambdas (hopefully in a nearer future)
I wouldn't hold my breath.
> Or is this just an experiment from someone's masters thesis or something?
The proposal has seven authors, three of whom list industry affiliations and three various academic institutions. You're not required to know that some (all?) of the authors are on the C standard committee to tell that this is very probably a more serious proposal than someone's master's thesis.
> Why not just move the "resources acquired" code to a separate function? You wouldn't even need goto in that case, you could just nest if statements to do the cleanup.
That wouldn't work nicely with jumps out of the separate function. Not just with goto, but imagine the guarded block being a loop body and doing break/continue. The function would have to return some special value to indicate "I would like to break/continue here, please". Possible, but why would that be an improvement over goto for something that is clearly a goto use case that the compiler should handle?
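To make the "special value" point concrete, here is a rough sketch of a loop body moved into its own function. Everything here (loop_action, process_item, sum_even_until_negative) is a hypothetical name invented for illustration. The extracted function can release its own resource without goto, but it has to report back which jump it wanted, and the caller has to translate that into break/continue.

```c
#include <stdlib.h>

/* What the extracted body would like the enclosing loop to do. */
enum loop_action { LOOP_FALLTHROUGH, LOOP_CONTINUE, LOOP_BREAK };

/* Hypothetical loop body: acquires a scratch buffer, decides what to do
   with the item, releases the buffer, then reports the desired jump. */
static enum loop_action process_item(int item) {
    enum loop_action action = LOOP_FALLTHROUGH;
    char *scratch = malloc(16);
    if (scratch == NULL)
        return LOOP_BREAK;
    if (item < 0)
        action = LOOP_BREAK;     /* would have been a plain `break` inline */
    else if (item % 2 != 0)
        action = LOOP_CONTINUE;  /* would have been a plain `continue` */
    free(scratch);
    return action;
}

/* Sums the even items, stopping at the first negative one. */
static int sum_even_until_negative(const int *items, size_t count) {
    int sum = 0;
    for (size_t i = 0; i < count; i++) {
        enum loop_action action = process_item(items[i]);
        if (action == LOOP_BREAK)
            break;
        if (action == LOOP_CONTINUE)
            continue;
        sum += items[i];
    }
    return sum;
}
```

It works, but the control flow is now encoded twice: once in the helper and once in the caller.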
> So basically stack unwinding?
You're saying this as if you had puzzled out the "real meaning" hidden inside this proposal. But the proposal doesn't hide that this is, yes, basically stack unwinding.
> This says defer is implemented with longjmp.
This says that this reference implementation, the goal of which is to allow people to test the ergonomics of the feature, is implemented with longjmp. The proposal itself is written to allow such an implementation, but it doesn't require it.
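As a rough illustration of what a longjmp-based scheme can look like, here is a toy sketch. It is not the proposal's reference implementation, and every name in it (struct guard, guard_defer, guard_unwind, guard_panic, run_guarded) is invented for this example. Deferred actions are pushed on a fixed-size stack, and a panic runs them in reverse order before jumping back to the recovery point.

```c
#include <setjmp.h>
#include <stddef.h>

#define MAX_DEFERRED 8

typedef void (*deferred_fn)(void *);

struct guard {
    jmp_buf recover_point;
    int panic_code;
    size_t count;
    struct { deferred_fn fn; void *arg; } stack[MAX_DEFERRED];
};

/* Registers fn(arg) to run when the guard unwinds; returns -1 when full. */
static int guard_defer(struct guard *g, deferred_fn fn, void *arg) {
    if (g->count == MAX_DEFERRED)
        return -1;
    g->stack[g->count].fn = fn;
    g->stack[g->count].arg = arg;
    g->count++;
    return 0;
}

/* Runs the registered actions in reverse order of registration. */
static void guard_unwind(struct guard *g) {
    while (g->count > 0) {
        g->count--;
        g->stack[g->count].fn(g->stack[g->count].arg);
    }
}

/* Runs all deferred actions, then jumps back to the recovery point. */
static void guard_panic(struct guard *g, int code) {
    guard_unwind(g);
    g->panic_code = code;
    longjmp(g->recover_point, 1);
}

/* Example deferred action: counts how many cleanups ran. */
static void count_cleanup(void *arg) {
    ++*(int *)arg;
}

/* Returns 0 on normal completion, or the panic code. The guard is supplied
   by the caller so that no automatic variable of this function is modified
   between setjmp and longjmp. */
static int run_guarded(struct guard *g, int *cleanups, int fail) {
    g->count = 0;
    g->panic_code = 0;
    if (setjmp(g->recover_point) != 0)
        return g->panic_code;
    guard_defer(g, count_cleanup, cleanups);
    guard_defer(g, count_cleanup, cleanups);
    if (fail)
        guard_panic(g, 12);
    guard_unwind(g);
    return 0;
}
```

A compiler that implements the feature natively would not need any of this bookkeeping at run time, which is the distinction being drawn between the proposal and its reference implementation.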
Stack unwinding is one of the largest, most complicated to implement, and most controversial features of C++. Google, a company with billions of lines of C++, famously disables exception handling in most or all of their code [1]. Not only do many popular C++ projects disable exceptions, but even C++ compilers themselves disable exceptions in their own implementation [2]!
C became popular in large part because of its simplicity of implementation. Some of these projects historically disabled C++ exceptions because they were slow and bloated, a result of the difficulty of implementing them efficiently. Now that they're fast and less bloated, these projects still can't turn them on because code that relies on stack unwinding is incompatible with code that does not. This proposal just repeats all of the same problems as C++.
It is extremely surprising that such a large group of people would propose such a radical feature addition to C, especially one that is so complicated to implement and that effectively makes code that uses it totally incompatible with old code that doesn't. This is interesting when viewed as a feature of a new language based on C, but the idea of adding this to C is frankly absurd.
[1]: https://google.github.io/styleguide/cppguide.html#Exceptions
[2]: https://llvm.org/docs/CodingStandards.html#do-not-use-rtti-o...
If you really do actually want answers, you should probably, in this order, (a) study the actual proposal in detail and not confuse it with an imperfect reference implementation, (b) check comp.std.c for previous discussions on the topic and maybe ask there, (c) see if there are other previous discussions involving the members who proposed this (maybe the standard committee has some semi-public mailing list or something?) and maybe ask there, (d) contact the email address given in the proposal. In all cases, it's probably a very good idea to stay as civil as you were in this post, not as confrontational as you were above.
I'm stressing that you should study the proposal because some of the things you got hung up on before were not properties of the proposal but only of the reference implementation. Besides the longjmp issue, the dynamic allocation issue might be in this category as well. The proposal doesn't mention DEFER_ENOMEM, and I think a compiler would have enough information in any case to allocate the needed space on the stack.
    void * const p = malloc(25);
    if (p != NULL) {
        void * const q = malloc(25);
        if (q != NULL) {
            if (mtx_lock(&mut) != thrd_error) {
                mtx_unlock(&mut);
            }
            free(q);
        }
        free(p);
    }
At least to me, this flow is much easier to understand.
The complete flow is immediately apparent in your example, but the "effective" flow (the two malloc and the mutex) is harder to identify.
And now imagine the same thing with 10 or more nested conditions.
Defer or goto lets you mentally split the logic into two parts: on one hand the "effective" algorithm, and on the other hand the resource release.
Maybe a personal preference but I also like to keep my code flat. You are already 3 indentation levels deep without any logic in it.
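For comparison, here is a sketch of the same two-mallocs-and-a-lock shape written flat with goto. The lock is a hypothetical stand-in (demo_lock/demo_unlock on a plain flag) rather than mtx_lock, so the example doesn't depend on <threads.h>, and flat_cleanup is a made-up name.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a mutex so the sketch has no threading
   dependency. */
static int lock_held;

static int demo_lock(void) {
    if (lock_held)
        return -1;
    lock_held = 1;
    return 0;
}

static void demo_unlock(void) {
    lock_held = 0;
}

/* Returns 0 if every resource was acquired, -1 otherwise. */
static int flat_cleanup(void) {
    int result = -1;
    void *p = NULL;
    void *q = NULL;

    /* Acquisition: each failure jumps to the label that releases
       everything acquired so far. */
    p = malloc(25);
    if (p == NULL)
        goto out;
    q = malloc(25);
    if (q == NULL)
        goto out_p;
    if (demo_lock() != 0)
        goto out_q;

    /* The "effective" algorithm would go here, at one indentation level. */
    result = 0;

    /* Release, in reverse order of acquisition. */
    demo_unlock();
out_q:
    free(q);
out_p:
    free(p);
out:
    return result;
}
```

The acquisition and release halves are each readable on their own, at the cost of having to keep the labels in the right order by hand.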
Any platform where Clang and GCC aren't supported is a platform where this style of code shouldn't be used, no?
Check the implementation here: https://github.com/microsoft/GSL/blob/master/include/gsl/gsl....
Example from https://docs.microsoft.com/en-us/cpp/code-quality/c26448?vie...:
    void poll(connection_info info)
    {
        connection c = {};
        if (!c.open(info))
            return;
        auto end = gsl::finally([&c] { c.close(); });
        while (c.wait())
        {
            connection::header h{};
            connection::signature s{};
            if (!c.read_header(h))
                return;
            if (!c.read_signature(s))
                return;
            // ...
        }
    }
I love this pattern, it's a very nice way to have a kind of RAII but with more control and flexibility.