We are members of the C Standard Committee and associated C experts, who have collaborated on a new book called Effective C, which was discussed recently here: https://news.ycombinator.com/item?id=22716068. After that thread, dang invited me to do an AMA and I invited my colleagues so we upgraded it to an AUA. Ask us about C programming, the C Standard or C standardization, undefined behavior, and anything C-related!
The book is still forthcoming, but it's available for pre-order and early access from No Starch Press: https://nostarch.com/Effective_C.
Here's who we are:
rseacord - Robert C. Seacord is a Technical Director at NCC Group, and author of the new book by No Starch Press “Effective C: An Introduction to Professional C Programming” and C Standards Committee (WG14) Expert.
AaronBallman - Aaron Ballman is a compiler frontend engineer for GrammaTech, Inc. and works primarily on the static analysis tool, CodeSonar. He is also a frontend maintainer for Clang, a popular open source compiler for C, C++, and other languages. Aaron is an expert for the JTC1/SC22/WG14 C programming language and JTC1/SC22/WG21 C++ programming language standards committees and is a chapter author for Effective C.
msebor - Martin Sebor is Principal Engineer at Red Hat and expert for the JTC1/SC22/WG14 C programming language and JTC1/SC22/WG21 C++ programming language standards committees and the official Technical Reviewer for Effective C.
DougGwyn - Douglas Gwyn is Emeritus at US Army Research Laboratory and Member Emeritus for the JTC1/SC22/WG14 C programming language and a major contributor to Effective C.
pascal_cuoq - Pascal Cuoq is the Chief Scientist at TrustInSoft and co-inventor of the Frama-C technology. Pascal was a reviewer for Effective C and author of a foreword part.
NickDunn - Nick Dunn is a Principal Security Consultant at NCC Group, ethical hacker, software security tester, code reviewer, and major contributor to Effective C.
Fire away with your questions and comments about C!
- Locking down some categories of "undefined behaviour" to be "implementation defined" instead.
- Proper array support (which passes around the length along with the data pointer).
- Some kind of module system, that allows code to be imported without the possibility of name collisions.
There are efforts to define the behavior in cases where implementations have converged or died out (e.g., two's complement, shifting into the sign bit).
There have been no proposals to add new array types and it doesn't seem likely at the core language level. C's charter is to standardize existing practice (as opposed to invent new features), and no such feature has emerged in practice. Same for modules. (C++ takes a very different approach.)
Undefined behaviors tend to be undefined for a reason and shouldn't be thought of as defects in the standard. In my years on the committee, I have always argued to define as much behavior as possible and to as narrowly define undefined behaviors as possible.
We also had a recent discussion about adding additional name spaces (when discussing reserved identifiers), but it didn't gain much traction.
I second this one. One of the best things from Rust is its "fat pointers", which combine a (pointer, length) or a (pointer, vtable) pair as a single unit. When you pass an array or string slice to a function, under the covers the Rust compiler passes a pair of arguments, but to the programmer they act as if they were a single thing (so there's no risk of mixing up lengths from different slices).
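For readers who have not seen the idea, here is a minimal C sketch of a hand-rolled "fat pointer". The type `int_slice` and its helper functions are hypothetical names invented for this illustration; nothing like them exists in the C standard library, and the language gives no help in keeping the pair consistent.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "fat pointer": the data pointer and its length travel together. */
typedef struct {
    const int *data;
    size_t len;
} int_slice;

/* Bounds-checked read: stores s.data[i] in *out and returns true,
   or returns false (leaving *out alone) when i is out of range. */
static bool int_slice_get(int_slice s, size_t i, int *out) {
    if (i >= s.len)
        return false;
    *out = s.data[i];
    return true;
}

/* Sub-slice covering [start, end); an invalid range yields an empty slice. */
static int_slice int_slice_sub(int_slice s, size_t start, size_t end) {
    int_slice empty = { NULL, 0 };
    if (start > end || end > s.len)
        return empty;
    int_slice sub = { s.data + start, end - start };
    return sub;
}

/* Sums the elements; the length comes from the slice, not a separate argument. */
static long int_slice_sum(int_slice s) {
    long total = 0;
    for (size_t i = 0; i < s.len; i++)
        total += s.data[i];
    return total;
}
```

Because the length is carried inside the struct, a callee such as `int_slice_sum` cannot be handed a length that belongs to a different array, which is the mix-up the comment above is worried about.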
What is needed is a category of actions (which I would call "conditionally-defined") where implementations would be required to indicate via "machine-readable" means [e.g. predefined macros, compiler intrinsics, etc.] all possible consequences (one of which would be UB), from which the implementation might choose in Unspecified fashion. If an implementation reports that it may process signed arithmetic using temporary values that may, at the compiler's leisure, be of an unspecified size that's larger than specified, but signed overflow will have no effect other than to yield values that may be larger than their type would normally be capable of holding, then the implementation would be required to behave in that fashion if integer overflow occurs.
In general, the most efficient code meeting application requirements could be generated by demanding semantics which are as loose as possible without increasing the amount of user code required to meet those requirements. Because different implementations are used for different purposes, no single set of behavioral guarantees would be optimal for all purposes. If compilers are allowed to reject code that demands guarantees an implementation doesn't support, then the choice of what guarantees to support could safely be treated as a Quality of Implementation issue, but the behavior of code that claims to support guarantees would be a conformance issue.
That doesn't particularly need modules -- just some form of
namespace foo {
}
On a slightly more personal note: What are some undefined behaviors that you would like to turn into defined behavior, but can't change for whatever reasons that be?
Quoting http://blog.llvm.org/2011/05/what-every-c-programmer-should-...
> This behavior enables certain classes of optimizations that are important for some code. For example, knowing that INT_MAX+1 is undefined allows optimizing "X+1 > X" to "true". Knowing the multiplication "cannot" overflow (because doing so would be undefined) allows optimizing "X*2/2" to "X". While these may seem trivial, these sorts of things are commonly exposed by inlining and macro expansion. A more important optimization that this allows is for "<=" loops like this:
> for (i = 0; i <= N; ++i) { ... }
> In this loop, the compiler can assume that the loop will iterate exactly N+1 times if "i" is undefined on overflow, which allows a broad range of loop optimizations to kick in. On the other hand, if the variable is defined to wrap around on overflow, then the compiler must assume that the loop is possibly infinite (which happens if N is INT_MAX) - which then disables these important loop optimizations. This particularly affects 64-bit platforms since so much code uses "int" as induction variables.
for (int i = 0; i < N; i += 2) {
//
}
Reasonably common idea but the compiler is allowed to assume the loop terminates precisely because signed overflow is undefined.
I’m not trying to argue that signed overflow is the right tool for the job here for expressing ideas like “this loop will terminate”, but making signed overflow defined behavior will impact the performance of numerics libraries that are currently written in C.
From my personal experience, having numbers wrap around is not necessarily “better” than having the behavior undefined, and I’ve had to chase down all sorts of bugs with wraparound in the past. What I’d personally like is four different ways to use integers: wrap on overflow, undefined overflow, error on overflow, and saturating arithmetic. They all have their places and it’s unfortunate that it’s not really explicit which one you are using at a given site.
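As a rough illustration of three of those four flavors, here is a sketch in portable C. The function names `add_checked`, `add_saturating`, and `add_wrapping` are made up for this example. The fourth flavor, undefined on overflow, is simply what the built-in `+` already gives you for `int`.

```c
#include <limits.h>
#include <stdbool.h>

/* Error on overflow: returns false (and leaves *out alone) when a + b
   would not fit in int. The test itself never overflows. */
static bool add_checked(int a, int b, int *out) {
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}

/* Saturating: clamps to INT_MAX or INT_MIN instead of overflowing. */
static int add_saturating(int a, int b) {
    int result;
    if (add_checked(a, b, &result))
        return result;
    return b > 0 ? INT_MAX : INT_MIN;
}

/* Wrapping: the sum is computed in unsigned arithmetic, which is defined
   to wrap. Converting an out-of-range result back to int is
   implementation-defined; this sketch assumes the usual two's complement
   behavior of mainstream compilers. */
static int add_wrapping(int a, int b) {
    return (int)((unsigned)a + (unsigned)b);
}
```

Spelling the choice out as a function name at least makes it visible at the call site which behavior was intended, which is the explicitness the comment above is asking for.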
It might seem like defining the semantics for signed overflow would be helpful but it turns out it's not, either from a security view or for efficiency. In general, defining the behavior in cases that commonly harbor bugs is not necessarily a good way to fix them.
Personally, I would like to get rid of many of the trap representations (e.g., for integers) because there is no existing hardware in many cases that supports them and it gives implementers the idea that uninitialized reads are undefined behavior.
On the other hand, I just wrote a proposal to WG14 to make zero-byte reallocations undefined behavior that was unanimously accepted for C2x.
If people used them while parsing binary inputs that would prevent a lot of security bugs.
The fact that this question exists and is full of wrong answers suggests a language solution is needed: https://stackoverflow.com/questions/1815367/catch-and-comput...
Your code assumes that negating a negative value is positive. Your division check forgot about INT_MIN / -1. Your signed integer average is wrong. You confused bitshift with division. etc. etc. etc.
Unsigned arithmetic is tractable and should be treated with caution. Signed arithmetic is terrifying and should be treated with the same PPE as raw pointers or `volatile`.
This applies if arithmetic maps to CPU instructions, but not to Python or Haskell or etc. If you have automatic bignums, signed arithmetic is of course better.
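To make a few of the pitfalls listed above concrete, here is a hedged sketch of checks for division, negation, and averaging of `int` values. The helper names are hypothetical, and the averaging function assumes `long long` is wider than `int`, which the static assertion checks by size.

```c
#include <limits.h>
#include <stdbool.h>

/* Assumption of this sketch: long long can hold the sum of any two ints. */
_Static_assert(sizeof(long long) > sizeof(int),
               "sketch assumes long long is wider than int");

/* Division overflows in exactly one case besides division by zero:
   INT_MIN / -1, whose true result is INT_MAX + 1. */
static bool div_checked(int a, int b, int *out) {
    if (b == 0 || (a == INT_MIN && b == -1))
        return false;
    *out = a / b;
    return true;
}

/* Negating a negative value is not always positive: -INT_MIN overflows. */
static bool neg_checked(int a, int *out) {
    if (a == INT_MIN)
        return false;
    *out = -a;
    return true;
}

/* Average without the overflow hidden in (a + b) / 2: do the sum in a
   wider type. The result is truncated toward zero. */
static int average(int a, int b) {
    return (int)(((long long)a + (long long)b) / 2);
}
```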
I presume you'd want signed overflow to have the usual 2's-complement wraparound behavior.
One problem with that is that a compiler (probably) couldn't warn about overflows that are actually errors.
For example:
int n = INT_MAX;
/* ... */
n++;
With integer overflow having undefined behavior, if the compiler can determine that the value of n is INT_MAX it can warn about the overflow. If it were defined to yield INT_MIN, then the compiler would have to assume that the wraparound was what the programmer intended.
A compiler could have an option to warn about detected overflow/wraparound even if it's well defined. But really, how often do you want wraparound for signed types? In the code above, is there any sense in which INT_MIN is the "right" answer for any typical problem domain?
Sometimes you're writing code where it really, really matters and you're more than willing to spend the extra cycles for every add/mul/etc. Having these new types as a portable idiom would help.
How can one process unicode (UTF-8) properly in C? As a CJK person, I wish there was a robust solution. Are there any standardized ways or proposals? (Using wchar doesn't count.)
I recommend heading toward a future where only UTF-8 encoding is used for multibyte characters and UCS-2 or similar for wchar_t. There is no need to support several different encodings.
Basically store the text as char arrays, and convert them when needed. Meanwhile, you could use this single file header: https://github.com/RandyGaul/cute_headers/blob/master/cute_u...
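As a sketch of the "store as char arrays, convert when needed" approach, the following decodes UTF-8 by hand. It is not the API of the linked header; `utf8_decode` and `utf8_count` are hypothetical names, and a real project would more likely rely on an established library.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Decodes one code point from the n bytes at s. Returns the number of
   bytes consumed, or 0 if the input is truncated or not valid UTF-8
   (overlong forms, surrogates, and values above U+10FFFF are rejected). */
static size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *cp) {
    if (n == 0)
        return 0;
    unsigned char b0 = s[0];
    size_t len;
    uint32_t min, v;
    if (b0 < 0x80) {
        *cp = b0;
        return 1;
    } else if ((b0 & 0xE0) == 0xC0) {
        len = 2; min = 0x80;    v = b0 & 0x1F;
    } else if ((b0 & 0xF0) == 0xE0) {
        len = 3; min = 0x800;   v = b0 & 0x0F;
    } else if ((b0 & 0xF8) == 0xF0) {
        len = 4; min = 0x10000; v = b0 & 0x07;
    } else {
        return 0; /* stray continuation byte or invalid lead byte */
    }
    if (n < len)
        return 0;
    for (size_t i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;
        v = (v << 6) | (uint32_t)(s[i] & 0x3F);
    }
    if (v < min || v > 0x10FFFF || (v >= 0xD800 && v <= 0xDFFF))
        return 0;
    *cp = v;
    return len;
}

/* Counts code points in an n-byte buffer; returns false on invalid input. */
static bool utf8_count(const char *s, size_t n, size_t *count) {
    const unsigned char *p = (const unsigned char *)s;
    size_t seen = 0;
    while (n > 0) {
        uint32_t cp;
        size_t used = utf8_decode(p, n, &cp);
        if (used == 0)
            return false;
        p += used;
        n -= used;
        seen++;
    }
    *count = seen;
    return true;
}
```

Note that this counts code points, not user-perceived characters; grapheme clusters, normalization, and collation need Unicode data tables that a sketch like this does not carry.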
However, the book only describes the available standard functions, so even doing better than other manuals, everything it has to say on this subject fits in one chapter and feels underpowered.
Here are examples of working with unicode in C: https://begriffs.com/posts/2019-05-23-unicode-icu.html
(strchr is the most obvious, but in general most search/lookup type functions are like this...)
Add to clarify: the current prototype for strchr is
char *strchr(const char *s, int c);
Which just drops the "const", so you might end up writing to read-only memory without any warning. Ideally there'd be something like: maybe_const_out char *strchr(maybe_const_in char *s, int c);
So the return value gets const from the input argument. Maybe this can be done with _Generic? That kinda seems like the "cannonball at sparrows" approach though :/
(Also you'd need to change the official strchr() definition...)
There are plenty of programming languages that distinguish strongly between mutable and immutable references, and that have the parametric polymorphism to let functions that can use both kinds return the same thing you passed to them, though. C will simply just never be one of them.
With that, selecting the correct function via `_Generic` should be possible (`_Generic` is a bit fiddly, but matching on `const char * ` and `char * ` should work just fine for this), and for the most part this is actually an/the intended use case for `_Generic` - it's basically the same as the type-generic math functions, more or less.
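A minimal sketch of that `_Generic` dispatch might look like the following. The names `my_strchr`, `strchr_const`, and `strchr_mut` are invented for illustration, and the sketch only handles arguments whose type is exactly `char *` or `const char *` after array-to-pointer conversion.

```c
#include <string.h>

/* Two thin wrappers with const-correct signatures. */
static inline const char *strchr_const(const char *s, int c) {
    return strchr(s, c);
}

static inline char *strchr_mut(char *s, int c) {
    return strchr(s, c);
}

/* Picks the wrapper from the type of s, so the result keeps the
   constness of the argument. Any other argument type is a compile error. */
#define my_strchr(s, c)              \
    _Generic((s),                    \
        const char *: strchr_const,  \
        char *: strchr_mut)((s), (c))
```

With this, assigning `my_strchr(p, 'x')` to a plain `char *` when `p` is a `const char *` draws a diagnostic, whereas the standard `strchr` prototype quoted above silently drops the qualifier.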
But making function signatures const-correct solves only a small part of the problem. A new API can only be used in new code, and casts can remove the constness from pointers leaving open the possibility that poorly written code will inadvertently change the const object. An attempt to change a global variable declared const will in all likelihood crash, but changing a local const can cause much more subtle bugs.
In my view, a more complete solution must include improving the detection of these types of bugs in compilers and other static and even dynamic analyzers even without requiring code changes. It's not any more difficult to do than detecting out of bounds accesses. (In full generality it cannot be done just by relying on const; some other annotation is necessary to specify that a function that takes a const pointer doesn't cast the constness away and modify the object regardless.)
C++ solved this by overloading strchr():
const char *strchr(const char *s, int c);
char *strchr(char *s, int c);
C of course doesn't have overloading.
One solution could have been to define two functions with different names, perhaps "strchr" and "strcchr". The time to do that would have been 1989, when the original ANSI C standard was published.
I suppose a future C standard could leave strchr() as it is (necessary to avoid breaking existing code) and add two new functions.
Back in the '80s and '90s I was pretty good at C. I don't think there was anything about the language or the compilers that I did not understand. I used C to write real time multitasking kernels for embedded systems, device drivers and kernel extensions for Unix, Windows, Mac, Netware, and OS/2. I did a Unix port from swapping hardware to paging hardware, rewriting the processes and memory subsystems. I tricked a friend into writing a C compiler. I could hold my own with the language lawyers on comp.lang.c.
Somewhere in there I started using C++, but only as a C with more flexible strings, constructors, destructors, and "for (int i = ...)", and later added STL containers to that.
Sometime in the 2000s, I ended up spending more and more time on smaller programs that were mostly processing text, and Perl became my main tool. Also I ended up spending a lot of time helping out less experienced people at work who were doing things in PHP, or JavaScript, or Java. My C and C++ trickled to nothing.
I've occasionally looked at modern C++, but it is so different from what I was doing back in '90s or even early '00s I sometimes have to double check that I'm actually looking at C++ code.
Is modern C like that, or is it still at its core the same language I used to know well?
What might perhaps be more challenging is adjusting to the changes in compilers. They tend to optimize code more aggressively and so writing code that closely follows the rules of the language (rather than making assumptions about the underlying hardware, even valid ones) is more important today than it was back in the 80's.
https://gforge.inria.fr/frs/download.php/latestfile/5298/Mod...
(Homepage: https://modernc.gforge.inria.fr/ )
Aside from understanding how the language itself has changed, maybe something else to put on the list is how to apply more modern programming practices in C.
In the 90s, I don't think I ever saw C code with unit tests. Any kind of automated testing was pretty rare. I've become convinced that testing in some form is a good thing. If I were going back to C, I'd want to understand the best way to go about that.
People also didn't care (or know) much about security back then. C has some obvious pitfalls (buffer overflows, etc.), and it is pretty important to know good ways to minimize risk. I'd want to understand best practices and techniques for this.
Also, back then build tools were very simple, and some of them were not my favorite things to use (Imake, I'm looking at you). Build tools have advanced a lot since then. Features like reliable, deterministic incremental builds exist now. Some things could be less tedious to configure and maintain. There are probably best practices and preferred choices in build tools, but what exactly they are is another thing I'd want to know.
These are probably not questions that necessarily need an answer from people whose expertise is the language itself, though, so I guess this is a tangent.
Some hints on what I'm referring to can be found here: https://github.com/mpv-player/mpv/commit/1e70e82baa9193f6f02...
Unrelated, but I also miss a binary constant notation (such as 0b10101)
0xDEADB_EEF
0b1_010_110111001001
etc.
The GoLang defer statement defers the execution of a function until the surrounding function returns. The deferred call's arguments are evaluated immediately, but the function call is not executed until the surrounding function returns. It looks like an interesting mechanism for cleaning up resources.
For me "defer" only makes sense in the context of exceptions, basically as an equivalent to "finally". This is a slippery slope though, since golang's exceptions are, for a reason, rudimentary.
Would the proposed defer statement apply to loops as well? How would one implement such defers without dynamic allocation?
Right now it's just scary to start a new project in C. It would be really great if there was more emphasis on correctness of the produced code instead of the insane optimizations.
If you are concerned about safety there are ways to achieve that, like using MISRA C, formally verifying your C, or writing in another language like Rust.
Thankfully, there are a lot of tools to help developers catch UB these days (UBSan, static analyzers, valgrind, etc). I would recommend using those tools whenever starting a new project in C (or C++, for that matter).
int i;
[…]
i += 1;
potentially is undefined behavior; i could overflow.
Compilers nowadays are fairly good at warning about definite undefined behavior.
I don’t think anybody would be happy with a compiler that aborted on all potential undefined behavior. That would (almost) be equivalent to banning the use of all signed ints.
Adding a rule requiring implementations to error out in cases of undefined behavior would be hard to specify in the standard. It could (and in my view should) be done by providing non-normative encouragement as "Recommended Practice."
IIRC, this macro was added to C11 along with a batch of other "these are optional" macros for atomics, complex, threads, etc. However, I don't recall whether C99 adopted the features as optional features and missed the feature testing macro, or if they were required features in C99 that we made optional in C11.
It is a controversial feature that can produce bugs, and it is banned in a lot of projects (one famous example is the Linux kernel).
Safe strcpy
char *stecpy(char *d, const char *s, const char *e)
{
while (d < e && *s)
*d++ = *s++;
if (d < e)
*d = '\0';
return d;
}
int main(void) {
char buf[64];
char *ptr, *end = buf+sizeof(buf) ;
ptr = stecpy(buf, "hello", end);
ptr = stecpy(ptr, " world", end);
}
Existing solutions are still error-prone, requiring continual recalculation of buffer len after each use in a long sequence, when the only thing that matters is where the buffer ends, which is effectively a constant across multiple calls.
What are the chances of getting something like this added to the standard library?
There is the problem of detecting that the function overflows despite being a “safe” function. And there is the problem of precisely predicting what happens after the call, because there might be an undefined behavior in that part of the execution. When writing to, say, a member of a struct, you pass the address of the next member and the analyzer can safely assume that that member and the following ones are not modified. With a function that receives a length, the analyzer has to detect that if the pointer passed points 5 bytes before the end of the destination, the accompanying size is 5, if the pointer points 4 bytes before the end the accompanying size is 4, etc.
This is a much more difficult problem, and as soon as the analyzer fails to capture this information, it appears that the safe function a) might not be called safely and b) might overwrite the following members of the struct.
a) is a false positive, and b) generally implies tons of false positives in the remainder of the analysis.
(In this discussion I assume that you want to allow a call to a memory function to access several members of a struct. You can also choose to forbid this, but then you run into a different problem, which is that C programs do this on purpose more often than you'd think.)
*p += sprintf(*p, "hello");
*p += sprintf(*p, "world");
2. Is there, or was there ever a proposal to make struct types without a tag be structurally typed? This would not break backwards compatibility as far as I can see, and would make these types much more useful as ad-hoc bags of data. Small example:
struct {size_t size; void *data;} data = get_data();
int hash = hash_data(data);
I believe there was at least one proposal about error handling that more or less relied on the above to be valid semantically.
3. Is there any interest in making the variadic function interface a bit nicer to use? I would like to bring back an old feature and have an intrinsic to extract a pointer from the variadic parameter list, so that we can iterate over it ourselves (or even index directly).
void *arg_ptr = va_ptr(last);
More out there would be a parameter that would be implicitly passed to a variadic function to indicate the number of arguments.
void variadic(..., va_size count) {
}
variadic(10, 20, 30); // count would be three
1. I want this too.
2. Here is my proposal: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2366.pdf
3. Yes, variadic functions should be improved.
(The committee is fine with incremental improvements, but new syntax needs to have strong motivation behind it, much stronger than this.)
That being said, I would like it if the default types for variadic functions were promoted from int/float to int64_t/double in order to be more reflective of the wider ranges supported by these types.
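Until something like an implicit argument count exists, one workaround seen in practice is to let a macro count the arguments at compile time. The sketch below assumes every argument is an `int` and that at least one argument is passed; `sum_n`, `COUNT_INT_ARGS`, and `sum_ints` are names made up for this example.

```c
#include <stdarg.h>
#include <stddef.h>

/* Ordinary variadic function: the count has to be passed explicitly. */
static long sum_n(size_t count, ...) {
    va_list ap;
    long total = 0;
    va_start(ap, count);
    for (size_t i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}

/* Counts the arguments by measuring a compound literal array built from
   them. The operand of sizeof is not evaluated, so the arguments are
   evaluated only once, in the real call. */
#define COUNT_INT_ARGS(...) (sizeof((int[]){__VA_ARGS__}) / sizeof(int))

/* Wrapper that supplies the count automatically. */
#define sum_ints(...) sum_n(COUNT_INT_ARGS(__VA_ARGS__), __VA_ARGS__)
```

A call such as `sum_ints(10, 20, 30)` expands to `sum_n(3, 10, 20, 30)`. The trick breaks down for mixed argument types, which is part of why language-level support is being asked for.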
My job at NCC Group involves a lot of code reviews, so frequently the files that are of interest to me are the ones that contain the most defects. I typically identify these by compiling with compiler warnings turned up and warning suppression turned down. I'll frequently also make use of static and dynamic analysis, including the GCC and Clang sanitizers.
exctags --exclude=TAGS --exclude=TAGS.NEW --append -R -f TAGS.NEW --sort=yes && mv TAGS.NEW TAGS
My editor (vim) has native support for quickly jumping from a use to definition via this TAGS index. History is preserved (i.e., there is a "back" button), so you can quickly dive through 5 layers of API and back out to understand where a value went. It is quite useful for starting with what you know and following it to the surprising behavior, without executing the code.
It's hard to appreciate what's going on at WG14 (or take part) when you can see the results only from afar, with none of the surrounding discussion.
I recently read Jens Gustedt's blog on C2x where he casually recommended this as a way to get involved: "The best is to get involved in the standard’s process by adhering to your national standards body, come to the WG14 meetings and/or subscribing to the committee’s mailing list."
Afaict (from browsing the wg14 site), the mailing list and its archives are not open to access.
https://webcache.googleusercontent.com/search?q=cache:TnEGL4...
EDIT: In general, how is one supposed to approach wg14 with ideas or need for clarification on the standard's wording / interpretation?
I'm currently working on an update to the committee website to clarify exactly this sort of thing! Unfortunately, the update is not live yet, but it should hopefully be up Soon™.
Currently, both clarifications and ideas require you to find someone on the committee to ask the question or champion your proposal for you. We hope to improve this process as part of this website update to make it easier for community collaboration.
component.h:
struct obj;
typedef struct obj obj_t;
obj_t *obj_create(void);
// .. the rest of the API
component.c:
struct obj {
int status;
// .. whatever else
};
obj_t *
obj_create(void)
{
return calloc(1, sizeof(obj_t));
}
However, as the component grows in complexity, it often becomes necessary to separate out some of the functionality (in order to re-abstract and reduce the complexity) into another file or files, which also operate on "struct obj". So, we move the structure into a header file under #ifdef __COMPONENT_PRIVATE (and/or component_impl.h) and sprinkle #define __COMPONENT_PRIVATE in the component source files. It's a poor man's "namespaces".
Basically, this boils down to the lack of namespaces/packages/modules in C. Are you aware of any existing compiler extensions (as a precedent or work in that direction) which could provide a better solution and, perhaps, one day end up in the C standard?
P.S. And if C will ever grow such feature, I really hope it will NOT be the C++ 'namespace' (amongst many other depressing things in C++). :)
What I can say while we are on the subject, is that I have seen C code (most often C code that started its life in the 1990s, to be fair) that instead of showing an abstract struct in the public interface, showed a different struct definition.
Please don't do this. Yes, when compiling nowadays, eventually every compilation unit ends up as object files passed to a linker that doesn't know about types, but this is undefined behavior. It makes it difficult to find undefined behavior in the rest of the code because there is a big undefined behavior right in the middle of it.
You can spread components in as many object files or libraries as you wish.
IMHO it's not a C related problem but a code design one.
Write libraries (with headers) only if you need to share the code but if you're not sure about that just include it for your specific program.
There is no shame to include local files containing declarations and definitions.
I think it is a misconception from C programmers to write headers for local purpose.
In other words, will the C standard be effectively “done” at some time in the future?
Reviewing proposals to incorporate features supported by common implementations is another.
Aligning with other standards (e.g., floating point) and improving compatibility with others (C++) is yet another.
In general, when an ISO standard is done it essentially becomes dead. So for the C standard to continue to be active (on ISO's books) it needs to evolve.
http://www.open-std.org/jtc1/sc22/wg14/www/wg14_document_log...
These papers are usually quite interesting.
This is useful for parallel computations, optimizations and readability, e.g.
sum += f(2);
sum += f(2);
can be optimized to
x = f(2);
sum += x;
sum += x;
Would the current motto of the consortium forbid adding a feature such as marking a function as pure, that would not just promise, but also enforce that no side effects are caused (only local reads/writes, only pure functions may be called), and no inputs except for the function arguments are used?
Suppose I want to add some debug tracing into f():
f.c: 42: f entered
f.c: 43: returning 2
that's a side effect, right? But now the pure attribute tells a lie. Never mind though; I don't care that some calls to f are "wrongly" optimized away; I want the tracing for the ones that aren't.
In C++ there are similar situations involving temporary objects: there is a freedom to elide temporary objects even if the constructors and destructors have effects.
Even a perfectly pure function can have a side effect, namely this one: triggering a debugger to stop on a breakpoint set in that function!
If a call to f(2) is elided from some code, then that code will no longer hit the breakpoint set on f.
Side effect is all P.O.V. based: to declare something to be effect-free in a conventional digital machine, you have to first categorize certain effects as not counting.
There is at least one incorrect optimization present in Clang because of this (function that has no side-effects detected as pure, and call to that function omitted from a caller on this basis, when in fact the function may not terminate).
If you were enforcing this with the compiler, you would also need something that would suppress the enforcing, because the millions of pre-existing functions would probably not get an updated attribute marking it as pure. And once you do that, the compiler can't really trust anything that function does, because it may actually be calling a non-pure function.
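For context, GCC and Clang already offer the promise-only version of this as an extension: `__attribute__((const))` and `__attribute__((pure))`. The sketch below uses the former through a hypothetical `PURE_HINT` macro; the compiler trusts the annotation and does not verify it, which is exactly the gap discussed above.

```c
/* PURE_HINT is a made-up macro name. On GCC and Clang it expands to the
   "const" function attribute: a promise that the result depends only on
   the arguments and that the call has no side effects. On other
   compilers it expands to nothing. */
#if defined(__GNUC__)
#define PURE_HINT __attribute__((const))
#else
#define PURE_HINT
#endif

static int square_plus_one(int x) PURE_HINT;

static int square_plus_one(int x) {
    return x * x + 1;
}

/* With the hint, the compiler is allowed (not required) to evaluate
   square_plus_one(2) once and reuse the result for both additions. */
static int sum_twice(void) {
    int sum = 0;
    sum += square_plus_one(2);
    sum += square_plus_one(2);
    return sum;
}
```

If tracing output were added inside `square_plus_one`, the attribute would become a false promise and some trace lines could disappear, which is the scenario described in the replies above.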
Or incorporating features from this 14 item list? https://blog.regehr.org/archives/1180
As it appears these have failed: https://blog.regehr.org/archives/1287
Making C friendlier is always a good idea, and I think the committee is (slowly) working towards this goal. I would have to examine these papers by John Regehr in more detail. Looking quickly at his proposals I can see why he couldn't find consensus for these ideas as some of them do appear controversial.
An example of a friendly dialect of C is C0 (C-naught) from CMU. I don't think I'm exaggerating when I say that this language has not "caught on".
int test(int a, int b)
{
int c = a/b;
if (f1())
f2(a,b,c);
}
Should a compiler be required to compute c before calling f1, and thus have to store the value of c across the function call?
Better would be to define a set of semantics for loosely-sequenced traps, along with "causality barriers" to ensure that they only occur at tolerable times.
For instance, we added the '_Bool' data type and require you to include <stdbool.h> to spell it 'bool' instead and to get 'true' and 'false' identifiers. This was done to not impact existing code bases that had their own bool/true/false implementation with those spellings. Now that "enough" time has passed for legacy code bases to update, we're looking into making these "first-class" features of the language and not requiring <stdbool.h> to be included to use them. We're doing the same for things like _Static_assert vs static_assert, etc for the same reason.
For example: removing the register keyword, always requiring a return statement, etc etc.
A lot of changes can be made that will make static analysis easier.
There will always be people with 50 year old code bases that will never change (and some c89 compiler will always be there for them), but the language is pervasive enough that it deserves progressive changes to make it (even) simpler and safer and slightly more high level.
Just #include <stdc.h> and be done with it. No need to remember stdio, stdint, stdbool, limits, assert, signal.h, etc, etc.
This new header comes with a guarantee that use of identifiers in the standard-reserved namespace will break your code. Perhaps compilers could even enforce this preemptively.
EDIT: I work on embedded systems, where C is king, and it seems like I spend an inordinate amount of time working with code generators that build simple tables. All of which could go away with this feature.
I’ve always thought that idiomatic C for constexpr would be to write the code you want to be executed at compile-time in a separate file(s), build it, execute and then #include the result in your program before building the final executable, adding a build step but keeping overall complexity minimal.
This is different from the C++ approach, where everything and the kitchen sink is added to the standard and then you have to issue errata for errata for the standard and hope that the compiler you have to use for your current platform is keeping up with the latest changes.
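The build-step idiom described above could be sketched like this: a small generator computes a table and prints it as C source, and the main program later includes the generated file. The table contents (bit counts of each byte value) and all names here are illustrative assumptions, not something from the original comments.

```c
#include <stdio.h>

/* Number of set bits in the low 8 bits of v. */
static unsigned popcount8(unsigned v) {
    unsigned n = 0;
    for (v &= 0xFFu; v != 0; v >>= 1)
        n += v & 1u;
    return n;
}

/* Writes a C definition of a 256-entry lookup table to out.
   Returns 0 on success and -1 if any write fails. */
static int emit_popcount_table(FILE *out) {
    if (fprintf(out, "static const unsigned char popcount_table[256] = {\n") < 0)
        return -1;
    for (unsigned i = 0; i < 256; i++) {
        /* 16 entries per line keeps the generated file readable. */
        const char *sep = (i % 16 == 15) ? ",\n" : ", ";
        if (fprintf(out, "%u%s", popcount8(i), sep) < 0)
            return -1;
    }
    return fprintf(out, "};\n") < 0 ? -1 : 0;
}
```

A generator's `main` would call `emit_popcount_table(stdout)`, the build would redirect that output into a header (say, a hypothetical `popcount_table.h`), and the real program would `#include` it. The cost is one extra build step; the benefit is that the table code is plain C that can be tested like any other function.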
It's used by a lot of current software in Linux, notably systemd and glib2. It solves a major headache with C error handling elegantly. Most compilers already support it internally (since it's required by C++). It has predictable effects, and no impact on performance when not used. It cannot be implemented without help from the compiler.
int do_something(void) {
    FILE *file1, *file2;
    object_t *obj;
    file1 = fopen("a_file", "w");
    if (file1 == NULL) {
        return -1;
    }
    defer(fclose, file1);
    file2 = fopen("another_file", "w");
    if (file2 == NULL) {
        return -1;
    }
    defer(fclose, file2);
    obj = malloc(sizeof(object_t));
    if (obj == NULL) {
        return -1;
    }
    // Operate on allocated resources
    // Clean up everything
    free(obj); // this could be deferred too, I suppose, for symmetry
    return 0;
}
FWIW, I don't think it would wind up being spelled with attribute syntax because we would likely want programmers to have a guarantee that the cleanup will happen (and attributes can be ignored by the implementation).
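For comparison, here is a minimal sketch of how scope-exit cleanup can be approximated today with the cleanup variable attribute. This is a GCC/Clang extension, not standard C, and the names close_file and write_greeting are made up for illustration.

```c
#include <stdio.h>

/* Cleanup handler: receives a pointer to the variable leaving scope. */
static void close_file(FILE **fp)
{
    if (*fp != NULL) {
        fclose(*fp);
    }
}

/* Hypothetical helper: writes one line to path.
   Returns 0 on success and -1 on failure. */
int write_greeting(const char *path)
{
    /* GCC/Clang extension: close_file(&out) runs whenever out leaves scope. */
    FILE *out __attribute__((cleanup(close_file))) = fopen(path, "w");
    if (out == NULL) {
        return -1;
    }
    if (fputs("hello\n", out) == EOF) {
        return -1;   /* the handler still runs on this path */
    }
    return 0;        /* and on this one */
}
```

Being an attribute, this is exactly the kind of thing a conforming implementation is allowed to ignore, which is the concern raised above.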
I would like to see the Standard either rewritten in such a way as to actually define (sometimes as optional features) everything necessary to make an implementation suitable for a wide range of tasks, or else expressly state that, e.g. "There are some circumstances where the behavior of some action would be documented by parts of the Standard, the documentation of the implementation and execution environment, or other materials, but some other portions of the Standard would characterize those actions as invoking Undefined Behavior. This Standard expressly waives jurisdiction in such cases so as to allow implementations designed for a variety of purposes to process them in whatever fashion would best suit those purposes."
What would you think about including something like those last two sentences in the Standard, so as to help clarify its intention?
I followed the article which attempted to interpret the C standard and come to a conclusion. The conclusion is:
> The takeaway message is that pointer arithmetic is only defined for pointers pointing into array objects or one past the last element. Comparing pointers for equality is defined if both pointers are derived from the same (multidimensional) array object. Thus, if two pointers point to different array objects, then these array objects must be subaggregates of the same multidimensional array object in order to compare them. Otherwise this leads to undefined behavior.
Based on the above, I arrived at the conclusion after reading this that comparing two distinct malloc()'d pointers for equality itself is undefined behaviour since malloc() is likely to return pointers to distinct objects that are not part of a sub-aggregate object.
I know this is incorrect, but I don't know why I'm wrong.
[1]: https://stefansf.de/post/pointers-are-more-abstract-than-you...
&a + 1 == &b is unspecified: it may produce 0 or 1, and it may not produce the same result if you evaluate it several times.
Similarly, if both the char pointers p and q were obtained with malloc(10), after they have been tested for NULL, all these operations are valid:
p == q (false)
p + 1 == q (false)
p + 1 == q + 1 (false)
p + 10 == q + 1 (false)
Only p+10 == q and p == q+10 are unspecified (of the comparisons that can be built without invoking UB during the pointer arithmetic itself).
I have no idea what led that person to (apparently) write that &a==&b is undefined. This is plain wrong. I do not see any ambiguity in the relevant clause (https://port70.net/~nsz/c/c11/n1570.html#6.5.9p6 ). Yes, the standard is in English and natural languages are ambiguous, but you might as well claim that a+b is undefined because the standard does not define what the word “sum” means (https://port70.net/~nsz/c/c11/n1570.html#6.5.6p5 ).
Relational operators (< <= > >=) on pointers have undefined behavior unless both pointers point to elements of the same array object or just past the end of it. A single non-array object is treated as a 1-element array for this purpose.
(That's for object pointers. Function pointers can be compared for equality, but relational operators on function pointers are invalid.)
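To make the distinction concrete, here is a small sketch (the function names are mine, not from the standard or the book). Equality works for any two valid object pointers, while the relational version is only meaningful when the caller keeps both pointers inside one array, or one past its end.

```c
#include <stdbool.h>
#include <stddef.h>

/* Equality between pointers to unrelated objects is well defined. */
bool same_object(const void *p, const void *q)
{
    return p == q;
}

/* Relational comparison: defined only because both operands are formed
   from the same array. The caller must guarantee i <= n and j <= n. */
bool precedes_in_array(const int *base, size_t n, size_t i, size_t j)
{
    (void)n;   /* documents the bound; not needed for the comparison */
    return base + i < base + j;
}
```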
unsigned int x;
x -= x;
There's a lengthy StackOverflow thread where various C language-lawyers disagree on what the spec has to say about trap values, and under what circumstances reading an uninitialised variable causes UB. I'd appreciate an authoritative answer. Thanks for dropping by on HN!
You could argue that it suddenly becomes less UB if you take the address of x:
unsigned int x;
&x;
x -= x;
I'm not sure if this will add anything to the discussion on SO, but if you allow programs to do this, then after applying modern optimizing C compilers, you may end with multiplications by 2 that produce odd results, or uninitialized char variables that contain 500: http://blog.frama-c.com/index.php?post/2013/03/13/indetermin...
So the short answer is that, for all intents and purposes, you should consider use of uninitialized variables as UB, because C compilers already do. (There exists somewhere a document clarifying what C compilers can and cannot do with indeterminate values. A search for “wobbly values” might turn it up. Anyway, you do not want to have wobbly values in your C programs any more than you want it to have undefined behavior.)
https://www.digitalmars.com/articles/C-biggest-mistake.html
I.e. offering a way that arrays won't automatically decay to pointers when passed as a function parameter.
Your proposal replaces my use of an array with two things, a pointer (as before) and a length. This is not too helpful, because I already could have done that if I'd wanted to.
What is missing is the ability to pass an array. Sometimes I want to toss a few megabytes on the stack. Don't stop me. I should be able to do that. The called function then has a copy of the original array that it can modify without mangling the original array in the caller.
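For what it's worth, wrapping the array in a struct already gives you by-value passing in standard C today. A minimal sketch, with made-up names (int_block, BLOCK_LEN, sum_after_doubling) and a small size just for illustration:

```c
#include <stddef.h>

enum { BLOCK_LEN = 8 };

/* The struct wrapper prevents the usual array-to-pointer decay. */
struct int_block {
    int v[BLOCK_LEN];
};

/* Receives its own copy of the array; the caller's data is untouched. */
int sum_after_doubling(struct int_block b)
{
    int sum = 0;
    for (size_t i = 0; i < BLOCK_LEN; i++) {
        b.v[i] *= 2;      /* modifies the local copy only */
        sum += b.v[i];
    }
    return sum;
}
```

The size has to be a compile-time constant here, so it is not a replacement for a real pass-an-array feature, just the closest thing the language offers now.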
The committee works a lot like lobbying. A minority of people with a large financial interest in the technology (such as compiler writers) have undue influence because they participate in the process. I always encourage C language users to take a more active role, but they usually don't. Cisco is an example of a user community that actively takes part in C Standardization.
I hope that this perceived lack of enthusiasm means I am handling the conflict of interest honorably.
For example, if all past, current and contemplated hardware behaves in the same way, I assume that the standard will simply enshrine this behavior.
However, what if 99% of hardware behaves one way and 1% another? Do you set the behavior to "undefined" to accommodate the 1%? At what point do you decide that the minority is too small and you'll enshrine the majority behavior even though it disadvantages minority hardware?
---
[1] Famous examples include things like bit shift and integer overflow behavior.
We have dropped support for sign and magnitude and one's complement architectures from C2x (a decision Doug Gwyn does not agree with). There was some concern that Unisys may still use a one's complement architecture, but that this may only be in emulation nowadays.
Practically every second C codebase on earth has their own implementations of these at some point, and it remains a huge problem for e.g. writers of libraries, where you don't know how/where your library will be used.
2. I think many compilers already do this, but can the static initialization rules be relaxed a bit?
static const int a = 0;
static const int b = a; /* This is not standard C afaik. */
Thank you,
CodeandC
I'd also say there is consensus that (2) would be beneficial. There are some good ideas in http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2067.pdf although I don't think repurposing the register keyword for it was very popular. Not just because it wouldn't be compatible with C++ which deprecated register some time ago, but also because it's novel with no implementation or user experience behind it. My impression is that this is waiting for a new proposal.
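Until something along those lines is adopted, the usual portable workaround is to name the value with an enumeration constant (or a macro), since those are integer constant expressions. A small sketch; A_INIT and sum_ab are names I made up:

```c
/* An enumeration constant is an integer constant expression, so it can
   initialize objects with static storage duration. */
enum { A_INIT = 0 };

static const int a = A_INIT;
static const int b = A_INIT;   /* instead of: static const int b = a; */

int sum_ab(void)
{
    return a + b;
}
```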
1. Will the Apple's Blocks extension, which allows creation of Closures and Lambda functions, be included in C2X?
2. Are there any plans to improve the _Generic interface (to make it easy to switch on multiple arguments, etc.)?
We haven't seen a proposal to add them to C2x, yet. However, there has been some interest within the committee regarding the idea, so I think such a proposal could have some support.
> 2. Are there any plans to improve the _Generic interface (to make it easy to switch on multiple arguments, etc.)?
I haven't seen any such plans, but there is some awareness that _Generic can be hard to use, especially as you try to compose generic operations together.
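As an illustration of why this gets awkward, here is a sketch of what dispatching on two arguments looks like with C11 _Generic today: one selection nested inside each association of another. The macro PAIR_KIND and the enum are hypothetical names for this example.

```c
enum pair_kind { INT_INT, INT_DOUBLE, DOUBLE_INT, DOUBLE_DOUBLE };

/* Selects on the type of a first, then on the type of b.
   Every inner selection must list every type b may have. */
#define PAIR_KIND(a, b)                                    \
    _Generic((a),                                          \
        int:    _Generic((b), int:    INT_INT,             \
                              double: INT_DOUBLE),         \
        double: _Generic((b), int:    DOUBLE_INT,          \
                              double: DOUBLE_DOUBLE))
```

With N supported types the macro needs N*N associations, which is the composition problem in a nutshell.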
Mbed TLS, since I have it in mind from another thread, is also a pretty clean C library for the problem it tries to solve; it's a testament to its design that we (TrustInSoft, who had not participated in its development) were able to verify that some uses of the library were free of Undefined Behavior: https://tls.mbed.org
Do you think Annex K of C11 will be widely adopted by programmers or unused? Why aren't people adopting it?
Do you see the use of any analysis tools that are particularly effective for finding memory safety issues?
C++ added in smart pointers to its specification. Are there any plans to do something similar in future C specifications?
Thanks!
So far, it's not been widely adopted. Part of the issue is that there are specification issues relating to threads and the constraint handlers, and part of the issue is that popular libc implementations have actively resisted implementing the annex.
That said, I field questions about Annex K on a regular basis and there are a few implementations in the wild, so there is user interest in the functionality.
> Do you see the use of any analysis tools that are particularly effective for finding memory safety issues?
<biased opinion>I think CodeSonar does a great job at finding memory safety issues, but I work for the company that makes this tool.</biased opinion>
I've also had good luck with the memory and address sanitizers (https://github.com/google/sanitizers) and tools like valgrind.
> C++ added in smart pointers to its specification. Are there any plans to do something similar in future C specifications?
We currently don't have any proposals for adding smart pointers to C. Given that C does not have constructors or destructors, we would have to devise some new mechanism to implement or replace RAII in C, which would be one major hurdle to overcome for smart pointers.
Joining the committee requires you to be a member of your country's national body group (in the US, that's INCITS) and attend at least some percentage of the official committee meetings, and that's about it. So membership is not difficult, but it can be expensive. Many committee members are sponsored by their employers for this reason, but there's no requirement that you represent a company.
I joined the committees because I have a personal desire to reduce the amount of time it takes developers to find the bugs in their code, and one great way to reduce that is to design features to make it harder to write the bugs in the first place, or to turn unbounded undefined behavior into something more manageable. Others join because they have specific features they want to see adopted or want to lend their domain expertise in some area to the committee.
Is this enough to answer your question? I can look up the names of the people that were involved and communicate them privately if you are further interested.
Unsigned arithmetic never overflows, and guarantees two's-complement behavior, because unsigned arithmetic is always carried out modulo 2^n:
> A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type. (6.2.5, Types)
Doing the computation in unsigned always does the "right thing"; the thing that one needs to be careful of with this approach is the conversion of the final result back to the desired signed type (which is very easy to get subtly wrong).
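Here is a sketch of that approach for a wrapping signed add, including the conversion back. It assumes a two's-complement int with UINT_MAX == 2 * INT_MAX + 1, which holds on mainstream platforms; the function names are mine.

```c
#include <limits.h>

/* Convert to int as if by two's-complement wraparound, without relying
   on the implementation-defined result of an out-of-range conversion. */
static int to_int_wrapped(unsigned int u)
{
    if (u <= (unsigned int)INT_MAX) {
        return (int)u;
    }
    /* Here ~u == UINT_MAX - u, which is at most INT_MAX. */
    return -(int)(~u) - 1;
}

/* Signed addition that wraps around instead of invoking undefined behavior. */
int wrapping_add(int a, int b)
{
    return to_int_wrapped((unsigned int)a + (unsigned int)b);
}
```

The easy-to-get-wrong part is the last step: a plain (int) cast of a value above INT_MAX is implementation-defined rather than guaranteed to wrap.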
I have a love-hate relationship with C - I like it for small projects, but anything serious I really need to write it in a more safe language. I think GCC has some flags that can help, and I've been using tools like splint, but something baked into the standard would be amazing.
I guess what I mean by that is a language that has Rust's hyperactive, strongly opinionated compiler, borrow checker, no NULL, immutable by default, etc, but in a language that is no more syntactically ambitious than C89. I would be way more into a language like that than Rust.
A language that sort of feels like Go, but can actually be used for low-level systems programming.
On another note:
- Official support for __attribute__
- void pointers should offset the same size as char pointers.
- typeof (can't stress this one enough)
- __VA_OPT__
- inline assembly
- range designated initializer for arrays
- some GCC/Clang builtins
- for-loop once (Same as for loop, but doesn't loop)
Finally, stop putting C++ craps into C.
I know some people are against metaprogramming because they believe the abstractions hide the details of how the underlying code will execute, but I would love to write substantial tests in C without relying on FFI to Python or C++ to perform property-based testing, complex fuzzing, and whatever. I feel metaprogramming would be a huge boon for C tooling and developer productivity.
Do you also answer questions about the standard libraries? This is not so much a C question as a library question:
I'm wondering if Apple's Grand Central Dispatch ever made it into a more integrated role in C's libraries, or if it will forever remain an outside add-on. And whether there is anything else at that level (level in the sense of high versus low level) in the standard libraries that plays such a role, that I should read up on instead of GCD.
We're remaining active while there are still people asking questions, so the west coast folks should hopefully have the chance to ask what they'd like.
> Do you also answer questions about the standard libraries?
Sure!
> I'm wondering if Apple's Grand Central Dispatch ever made it into a more integrated role in C's libraries, or if it will forever remain an outside add-on.
GCD has not been adopted into C yet, and I don't believe it's even been proposed to do so by anyone (or an alternative to GCD, either).
It would be an interesting proposal to see fleshed out for the committee, and there is a lot of implementation experience with the feature, so I think the committee would consider it more carefully than an inventive proposal with no real-world field experience.
Conceptually, something must indicate to the function how many arguments it is supposed to request next, and with what types. Yes, you could write a function where this information is passed through a static-lifetime variable, but in practice the first mandatory argument is almost always used for that anyway.
Another thing that is very cumbersome to do in C is object creation; creating instantiable objects is possible but very cumbersome. Is there some feature in the thought process to deal with it? To make it clear: in C we can create a data structure like a stack or a queue easily, but if the program needs 10 stacks, there is presently no simple way of achieving it.
To minimize the external identifiers, one could make just the name of a container structure the sole entry access handle, with structure members pointing to the functions. Then use it like:
#include <Mm.h>
if ((new = Mm.allo(size)) == NULL)
    Er.abort("out of memory");
What's the plan for C over the next 5 - 10 years?
One goal is to re-unify C with the concurrency object model used by C++ to make std::atomic<T> and _Atomic(T) be ABI compatible as intended in C11. Some small fixes in this area are the removal of ATOMIC_VAR_INIT, clarifying whether library functions can use thread_local storage for internal state, and things along those lines. However, we expect there to be more efforts in this area as we progress the standard.
- stricter type-checks on typedef types (useful when passing function parameters)
- gcc's 'warn_unused_result' attribute for functions (ensure error returns are checked)
- on-entry/on-exit qualifiers for functions (to do things like make sure you lock/unlock semaphores for instance before entry/exit of function)
- D language's 'scope' feature (better handling of error path)
- loops in the c pre-processor! (better code-gen)
Any chance any of this is on the radar for the next-gen C standard? Some of these are just ergonomics, but the first two might have saved me some grief a few times.
I wouldn't mind seeing a new feature that does define a new type (one that's identical to, but incompatible with, an existing type), but we can't call it "typedef".
In a sense that feature already exists. You can define a structure with a single member of an existing type. But you have to refer to the member by name to do anything with it.
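A minimal sketch of that single-member-struct idiom, with illustrative type names of my own choosing:

```c
/* Two distinct, mutually incompatible types wrapping the same
   underlying representation. */
struct meters  { double value; };
struct seconds { double value; };

/* Accepts only struct meters; passing a struct seconds is a constraint
   violation that the compiler must diagnose. */
double meters_to_feet(struct meters m)
{
    return m.value * 3.28084;
}
```

The cost is the one mentioned above: every use has to spell out .value.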
“Static analysis” may be the wrong name to classify the tools that work in that area, because “static analysis” is usually used for purely automatic tools, whereas the tools used to guarantee the absence of undefined behaviors are not entirely automatic except for the simplest of programs.
Results of a static analyzer are often characterized in terms of “false positives” and “false negatives”. It is a possible design choice to make an analyzer with no false negatives. It is absolutely not impossible! (Some people think it is fundamentally impossible because it sounds like a computer science theorem, but it isn't one. The theorem would apply if one intended to make an analyzer with no false positives and no false negatives—and if computers were Turing machines.)
Analyzers designed to have no false negatives are called “sound”. In practice, this kind of analyzer may prove that a simple program is free of Undefined Behavior if the program is a simple example of 100 lines, but for a more realistic software component of at least a few thousand lines, the result will be obtained after a collaborative human-analyzer process (in which the analyzer catches reasoning errors made by the human, so the result is still better than what you can get with code reviews alone).
Here is what the result of this collaborative human-analyzer process may look like for a library as cleanly designed and self-contained as Mbed TLS (formerly PolarSSL): https://trust-in-soft.com/polarSSL_demo.pdf?
struct foo { int a; void *p; };
struct foo f = {0}; // legal C, f.p initialized like a static variable
struct foo f = {}; // not legal but supported by gcc
To me it would make sense that there is no need to specify a value for any of the members that are intended to be initialized exactly like static variables (and the first member is not special so I shouldn't have to explicitly assign a zero?). However the syntax currently demands at least one initializer.
--
2. I recall seeing a proposal for allowing declarations after case labels:
switch (foo) {
case 1:
int var;
// ...
}
This is currently not allowed and you'd have to wrap the lines after case in braces, or insert a semicolon after the case label. Is this making it to c2x?
--
3. I've run into some recent controversy w.r.t. having multiple functions called main (and this has come up in production code). In particular, I ran into a program that has a static main() function (with parameters that are not void or int and char[]), which is not intended to be the main function that is the program's entry point.
gcc warns about this because the parameters disagree with what's prescribed for the program entry point. It's not clear to me whether this is intended to be legal or not.
--
4. Looking at the requirements for main brings up another question: it says how main should be defined (no static or extern keyword). However, the definition could be preceded by a static declaration, which then affects the definition that follows:
If the declaration of an identifier for a function has no storage-class specifier, its linkage is determined exactly as if it were declared with the storage-class specifier extern.
For an identifier declared with the storage-class specifier extern in a scope in which a prior declaration of that identifier is visible, if the prior declaration specifies internal or external linkage, the linkage of the identifier at the later declaration is the same as the linkage specified at the prior declaration.
Therefore, it is possible to have a main function with internal linkage and a definition that exactly matches the one given in the spec:
static int main(int, char *[]);
int main(int argc, char *argv[]) { /* ... */ }
As one might guess, this program doesn't make it through the linker when compiled with gcc. Is this supposed to be legal? Should the spec perhaps require main to have external linkage, and then allow other functions called main with internal linkage (and parameters that do not match what is required of the external one)?
EDIT: ---
Are the fixes w.r.t. reserved identifiers going to make it in c2x? Can I finally have a function called toilet() without undefined behavior?
struct foo {
    enum { t_char, t_int, t_ptr, /* .. */ } type;
    int count;
    union {
        char c[];
        int i[];
        void *p[];
        /* .. */
    };
};
This isn't allowed, since flexible array members are only allowed in structs (but the union here is exactly where you'd put a flexible array member if you had only one type to deal with).
Furthermore, you can't work around this by wrapping the union's members in a struct because they must have more than one named member:
struct foo {
    enum { t_char, t_int, t_ptr } type;
    int count;
    union { /* not allowed! */
        struct { char c[]; };
        struct { int i[]; };
        struct { void *p[]; };
    };
};
But it's all fine if we either add a useless dummy variable or move some prior member (such as count) into these structs:
struct foo {
    enum { t_char, t_int, t_ptr } type;
    int count;
    union { /* this works but is silly and redundant */
        struct { int dumb1; char c[]; };
        struct { int dumb2; int i[]; };
        struct { int dumb3; void *p[]; };
    };
};
Of course, you could have the last member be
union { char c; int i; void *p; } u[];
but then each element of u is as large as the largest possible member which is wasteful, and u can't be passed to any function that expects to get a normal, tightly packed array of one specific type.
I appreciate the original simplicity of K & R, "The C Programming Language", 2nd Edition, and the relatively simple semantics of ANSI C89/ISO C90 compared to C99 and later.
You don't need complex parsing methods for ANSI C89/ISO C90 and you do not need the "lexer hack" to handle the typedef-name versus other "ordinary identifier" ambiguity.
A surprising number of colleges still teach K & R 2nd Edition C.
Whenever someone brags about using recursive-descent parsing methods, I always ask, are they using predictive, top-down parsing, or back-tracking?
I hope C never loses sight of its roots nor morphs into C++ under the guise of creating a common subset, but which is really a disguised superset of C and C++
Please prevent the ever increasing demand for new features from overwhelming C's simplicity so it can no longer be parsed with simple methods.
From the outside, after Annex K adoption failure, WG14 doesn't seem to be willing to make C safer in any way.
Are there any plans to take efforts like Checked C in consideration regarding the future of ISO C?
In my view C and C++ are now almost different languages, with a different philosophy of programming, a different future, and a different language design.
It will be sad if "modern" C++ almost replaces C. Many C++ developers use "Orthodox C++" (https://gist.github.com/bkaradzic/2e39896bc7d8c34e042b), and this shows that people would be more comfortable with C plus some really useful features (namespaces, generics, etc.) than with modern C++. I very often hear from my colleagues and from many other people who work with C++ how terrible modern C++ is (https://aras-p.info/blog/2018/12/28/Modern-C-Lamentations/, https://www.youtube.com/watch?v=9-_TLTdLGtc) and how good it would be to see and use a new C with some extra features. Maybe it is time to start thinking about evolving C, for example:
- Generics. Something like generics in Zig, Odin, Rust. etc.
- AST Macros. For example Rust or Lisp macros, etc.
- Lambda
- Defer statement
- Namespaces
What do you think?
https://ziglang.org/documentation/master/#Generic-Data-Struc...
https://odin-lang.org/docs/overview/#parametric-polymorphism
By passing -Wl,--wrap=some_function at link time with test code we can then define
__wrap_some_function
that will be called instead of some_function. Within __wrap_some_function one can also call __real_some_function which will resolve to the original version if you still want to call the original one. This is especially useful if trying to observe certain function calls in tests that interact with hardware.
Do you have any other recommendations/preferences to help with unit-testing C code?
- Basic type inference to reduce keystrokes, and prevent ripples when changing types. (like auto in C++)
- Equality operators defined for structs. Perhaps even lexicographical comparison, if I'm dreaming.
Any thoughts on either of those?
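For context on the second item, this is what has to be written by hand today. A sketch with a made-up struct; comparing member by member avoids depending on padding bytes, which a memcmp over the whole struct would also compare.

```c
#include <stdbool.h>

struct point {
    char tag;   /* padding bytes typically follow this member */
    int  x;
    int  y;
};

/* Member-wise equality: padding bytes are never inspected. */
bool point_equal(const struct point *a, const struct point *b)
{
    return a->tag == b->tag && a->x == b->x && a->y == b->y;
}
```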
Does the committee have any plans to make NULL pointer arguments to memcpy non-UB when the size argument is 0?
int f (a, n) int n; int a[n][n]; { return a[n-1][n-1]; }
How could one define this function without using the obsolete syntax?
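As far as I know, the only way to write this with prototype syntax in standard C is to change the parameter order so that n is declared before it is used. A sketch (the name last_diagonal is mine; variably modified parameters need C99, and VLA support is optional in C11):

```c
/* n must be in scope before it appears in the array declarator,
   so the order differs from the old-style definition f(a, n). */
int last_diagonal(int n, int a[n][n])
{
    return a[n - 1][n - 1];
}
```

That keeps the body identical, but it does change the call signature, which is presumably the point of the question.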
Numeric overflows in things like calculation of buffer sizes can lead to vulnerabilities.
Signed overflow is UB, and due to integer promotion signs creep in unexpected places.
It's not trivial to check if overflow happened due to UB rules. A naive check can make things even worse by "proving" the opposite to the optimizer.
And all of that is to read one bit that CPUs have readily available.
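For reference, this is roughly what the portable checks look like today, with the test done before the operation so that no overflow is ever evaluated. It is only a sketch and the helper names are hypothetical.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stores count * size in *out; returns false if the product would wrap. */
bool checked_mul_size(size_t count, size_t size, size_t *out)
{
    if (size != 0 && count > SIZE_MAX / size) {
        return false;
    }
    *out = count * size;
    return true;
}

/* Stores a + b in *out; returns false if the sum is not representable.
   The comparison happens first, so the optimizer has no UB to exploit. */
bool checked_add_int(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    *out = a + b;
    return true;
}
```

Compared with reading a single carry or overflow flag, that is a lot of ceremony, which is the complaint above.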
Maybe also cover some means, algorithms, and code for reporting on the state, status, etc. of the memory use by malloc() and free().
By the way, I know and have known well for longer than most C programmers have lived JUST what the heap data structure, as used in "heap sort", is. But what is the meaning of "the heap" in C programming language documentation?
(2) Cover in overwhelmingly fine detail the "stack" and the chuckhole in the road, stack overflow.
(3) Where to get a reliable package for a reasonable package of code for handling character strings -- what I saw and worked with in C is not reasonable.
(4) From the C programming I did, it looks like a large C program for significant work involves some hundreds, maybe tens of thousands, of includes, inserts, whatever, and what a linkage editor would call external references. There must somewhere be some tools to help a programmer make sense of all those includes and references, the resulting memory maps, issues of locality of reference, word boundary alignment, etc.
(5) How can C exploit a processor with 64 bit addressing and main memory in the tens of gigabytes and maybe terabytes?
(6) How can C support, i.e., exploit, integers and IEEE floating point in 64 and/or 128 bit lengths?
(7) How to handle exceptional conditions with, say, non-local gotos and without danger of memory leaks?
(8) Sorry, but far and away my favorite programming language long has been and remains PL/I, especially for its scope of names rules, handling of aggregates with external scope, its data structures, and its exceptional conditional handling with non-local gotos and freeing automatic storage and, thus, avoiding memory leaks. Of course I can't use PL/I now, but the problems PL/I solved are still with us, also when writing C code. So, how to solve these problems with C code?
(9) For C++, please explain how that works under the covers. E.g., some years ago it appeared that C++ was defined as only a source code pre-processor to C. Is this still the case? If so, then explaining C++ under the covers should be feasible and valuable.
I am a beginner level programmer and C is not one of the languages for which I have even bothered to write a "hello world". That is my level.
As the people who "run" C, why do we need C? Forget the legacy systems. There are fancy languages like Go, Rust, Elixir, Python and millions of others. And of course, the "offspring" like C++ & C#.
What was the use case that C was designed for (I have read from sources like Wikipedia, would love to hear straight from source)? In 2020, how relevant is C? If someone is going to write a system/application today, why consider C? Do you think, C will be relevant in 5 yrs (I know 1 yr in computing is like 10 yrs for humans)? With all your combined experience in computing over the years and as the members of a team that is guiding a valuable thing like "C". What is your advice/wisdom/thought for us?
Will this be addressed in future revisions of the C standard?
I would suggest that the Standard define directives to demand three modes, with the proviso that a compiler may reject code which demands a mode it cannot accommodate:
1. clang/gcc mode, which would be adjusted to match the way clang and gcc actually behave, as well as anything they want to do but their interpretation of the Standard wouldn't allow.
2. precise mode, which behaves as though all loads and stores of objects whose addresses are taken behave according to a precise memory-based abstraction
3. sequence-based mode, which would allow compilers to hoist, defer, consolidate, and eliminate loads and stores in cases where they honor data dependencies that are visible in the code sequence, but would require that compilers recognize visible dependencies which clang and gcc presently ignore, and would also require that the definition of "based on" used by "restrict" recognize that any pointer formed by adding or subtracting an integer from another pointer be recognized as "at least potentially based on" the former, even in corner cases where clang and gcc would ignore that.
Recognizing mode #1 would allow clang and gcc to keep using their aliasing logic with programs that can tolerate it. Mode #2 would ensure that all programs that have trouble with that logic could have defined behavior by adding a directive demanding it. Mode #3 would allow most of the same useful optimizations as mode #1, but work with a wide range of programs that would presently require `-fno-strict-aliasing`.
If one recognizes the need for different modes, the effort required to describe all three modes would be tractable, compared to the obviously-intractable problem of reaching consensus about one mode that would need to serve all purposes.
- Are you planning any addition regarding modeling of how modern CPUs work (e.g. pipelines, branches, speculative execution, cache lines, etc)?
PS: Thank you for doing this!
First off thank you so much for taking the time to answer questions.
As a new programmer starting with C, I am trying to learn how to go from beginner to intermediate. Any recommendations of projects to help learn C?
It is difficult for me to find projects that I see as "valuable", for lack of a better term.
Thank you!
For instance, a lot of redundant code (or ugly macro business) could be neatly replaced by function templates. Even just template functions with only POD values allowed would be a great readability improvement.
Why not mandate a warning every time the compiler detects and makes use of UB? It would solve SO many issues. If you are looking to improve security of C programs, then letting the user know what the compiler does should be number one.
Converting as many UBs as possible to platform-specific behavior would also be a big help.
I would love to see native vector types. It's time. Vector types are now more common in hardware than float was when it was included in the C spec. Time to make it a native type. Hoping the compiler does the vectorization for you is not good enough.
Allow for more than one break.
for(i = 0; i < n; i++) for(j = 0; j < n; j++) if(array[i][j] == x) break break;
is equal to:
for(i = 0; i < n; i++) for(j = 0; j < n; j++) if(array[i][j] == x) goto found; found :
Have you considered adding access to structure members by index or by string name? Have you considered dynamic structures?
Have you ever considered or will you consider deprecating char, int, long, (s)size_t, float, double, etc. in favour of specific length types?
Will you ever add / have you considered adding [su]\d+ and f\d+ as synonyms for those mentioned in stdint.h?
Since char is signed on most platforms, arm eabi being an exception and even there it's really just a matter of compile time flags, will you ever just drop char from being able to be either and just say it's signed, as int is also signed?
Will you ever define / have you considered defining signed overflow behaviour?
"m" is the same as "w", but does not truncate the file. In POSIX terms, it doesn't add O_TRUNC to the flags.
There is "r+", of course; but "r+" requires that the file exists already. In POSIX terms, "r+" does not include the O_CREAT flag.
fopen("foo", "m") creates the file if it does not exist, and opens it for writing. The stream is positioned at the beginning of the file without truncating it.
We can sort of emulate it with fopen("foo", "a"), then fclose, then open with "r+".
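The emulation described above could be sketched roughly as follows. The function name `open_for_modify` is hypothetical, and this is only an approximation of the proposed "m" mode: the stream it returns is open for both reading and writing (because of "r+"), and there is a window between the two `fopen` calls in which another process could remove or replace the file.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper approximating the proposed "m" mode:
   create the file if it is missing, never truncate it, and
   return a stream positioned at the beginning of the file. */
static FILE *open_for_modify(const char *path)
{
    assert(path != NULL);

    /* "a" creates the file when it does not exist and never truncates it. */
    FILE *fp = fopen(path, "a");
    if (fp == NULL)
        return NULL;
    if (fclose(fp) != 0)
        return NULL;

    /* "r+" requires an existing file and starts at offset 0. */
    return fopen(path, "r+");
}
```

On POSIX systems, a single `open` call with `O_WRONLY | O_CREAT` followed by `fdopen` would avoid the two-step window, at the cost of leaving ISO C.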
How can I find help for this?
1) namespaces, so function names don't need to be 30 characters to avoid naming collision
2) guaranteed copy elision or RVO -- provides greater confidence for common idioms and expressivity compared to passing out parameters
Is it fair to assume that hardware-related decisions occur in an environment where members who are sponsored by vendors argue their employer's case, rather than a neutral one?
---
[1] E.g., because some hardware's behavior may more naturally implement the operation.
For example, I have an arbitrary number of includes, each of which declares a struct that needs to be listed later on.
#define MOD_LIST // start with an empty list
#include "mod/a.c"
// MOD_LIST is: a,
#include "mod/b.c"
// MOD_LIST is: a,b,
Module modules[] = {
MOD_LIST
};
_If, _Ifdef, _Ifndef inside function macros, for example:
#ifdef SOME_CONST
#define WHATEVER(w, h, a, t, e, v, e, r) \
... common part ... \
... for SOME_CONST ... \
... common part continued ...
#else
#define WHATEVER(w, h, a, t, e, v, e, r) \
... common part ... \
... when SOME_CONST not defined ... \
... common part continued ...
#endif
With _Ifdef, the above could be written like:
#define WHATEVER(w, h, a, t, e, v, e, r) \
... common part ... \
_Ifdef(SOME_CONST, \
(... for SOME_CONST ...) , \
(... when SOME_CONST is not defined ...) \
) \
... common part continued ...
With these, one could also do:
#define FACTORIAL(n) _If(n == 0, 1, (n) * FACTORIAL(n - 1))
int f = FACTORIAL(6);
turns into:
int f = (6) * (5) * (4) * (3) * (2) * (1) * 1;
That would be very useful, I think. It might help with code duplication in function macros.
Maybe _Switch/_Case thereafter.
- Why no sized text strings?
- Why is there no hash data type?
- Where's the linked list?
- Why no package management as part of ecosystem?
What is the modern rationale?
Caveats:
- I'm not implying any need for object-orientation (OOP)
- I'm fully aware I can write these myself and can access third party libraries that have each laboriously implemented their own versions.
- I'm interested in why these are not native C constructs in 2020. I appreciate why not in 1980.
I tend to use glib for my (academic) code for pretending C is a high-level language. It also seems to make up for implementation-dependent functions in C and many portability issues. Also, IMO, vala > C++.
My question is, really, are there any other tools for high-level C programming and do you know of any disadvantages of the Gnome stack?
This might not be too deep a question on the C language in regards to this book, but I've been wondering, why did you decide to have an eldritch horror as the book's cover?
For instance, the macOS clang environment does not define this symbol. Is their implementation of wchar_t or <wctype.h> lacking some aspect of Unicode support?
I could use a union type, but that adds extra memory operations, and is finicky.
Is there a better way?
# from Chp2 of Effective C
1) Are there any plans or discussions on having a subset/extension of C that is designed for formal verification? Much like SPARK with Ada.
2) Is there no plan to support GC? Even as an extension of C?
Anyway, here's some questions:
- What kind of programs would you say C is a good fit for?
- There is some catching up to do for C. Is there a roadmap for C improvement, or even a recommendation of C++ things that fit somewhat in the style/philosophy of C? For example, I'd recommend not using the C++ smart pointers stuff, while still using C++ threads and lambdas.
Also, you should include programmers from other fields in your committee. Game (engine) developers, HFT programmers are used to lower level styles of coding and align with your perspective.
If there exists any memory block allocated using malloc() / calloc() / realloc() which has not been free()'d, at the end of the program, they would be free()'d automatically.
One can easily do it with keeping a linked list and using atexit(), but, can it be added to the standard?
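A minimal sketch of the linked-list-plus-atexit() approach mentioned above might look like this. All names (`track_malloc`, `track_free`, `track_free_all`, `track_header`) are hypothetical, the code is not thread-safe, and it assumes C11 for `max_align_t`. Note that mainstream hosted operating systems already reclaim a process's memory at exit, so in practice this mostly matters for leak checkers or freestanding-style environments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical header stored in front of every tracked block. */
typedef union track_header {
    struct {
        union track_header *prev;
        union track_header *next;
    } link;
    max_align_t align;  /* keeps the user payload suitably aligned */
} track_header;

static track_header *track_head = NULL;
static size_t track_live = 0;      /* number of blocks not yet freed */
static int track_registered = 0;

/* Frees every block still on the list; registered with atexit(). */
static void track_free_all(void)
{
    while (track_head != NULL) {
        track_header *next = track_head->link.next;
        free(track_head);
        track_head = next;
    }
    track_live = 0;
}

static void *track_malloc(size_t size)
{
    if (!track_registered) {
        if (atexit(track_free_all) != 0)
            return NULL;
        track_registered = 1;
    }
    if (size > SIZE_MAX - sizeof(track_header))
        return NULL;  /* header + payload would overflow size_t */

    track_header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;

    /* Push the new block onto the front of the doubly linked list. */
    h->link.prev = NULL;
    h->link.next = track_head;
    if (track_head != NULL)
        track_head->link.prev = h;
    track_head = h;
    track_live++;
    return h + 1;  /* payload starts right after the header */
}

static void track_free(void *p)
{
    if (p == NULL)
        return;
    track_header *h = (track_header *)p - 1;

    /* Unlink the block from the list before releasing it. */
    if (h->link.prev != NULL) {
        h->link.prev->link.next = h->link.next;
    } else {
        assert(track_head == h);
        track_head = h->link.next;
    }
    if (h->link.next != NULL)
        h->link.next->link.prev = h->link.prev;
    free(h);
    track_live--;
}
```

The cost of this design is one header per allocation and the requirement that tracked blocks are only ever released through `track_free`, never plain `free`.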
A general question, will anything, any feature, which is "easy" to implement in pure C, like array knowing its own length or pascal strings, NOT be allowed to be in C standard, even if it is widely used, maybe almost everywhere?
void callback(int x, void *) // VOID STAR UNUSED, SO ANON
{
    foo(x);
}
2. Why is it harder to find LGPL-licensed libraries for accessing Windows directories over a network, like jcifs or pysmb (and libraries overall), when one needs to keep most of the source closed in order to sell small software products to businesses?
3. If you needed to combo C with another language to do everything you need to do forever and never look back what other language would that be?
Thanks for this!
This helps bringing these languages to embedded targets with closed toolchains (with an existing C compiler).
Will there be developments to use a subset of C as a “portable assembly” in a standard way? Like there is WebAssembly for JavaScript.
This looks to be a hell'va' good tool chain. I've been playing with it since yesterday.
The licenses of the majority of third-party libraries available for C are GPL. Do you think this makes it harder to reuse code in software that is sold?
Big question, how to start programming in C on a high professional level for somebody self schooled in it? Is there a way to cut the corner, without having to go through 10+ years trial and error to gain experience?
Anything for somebody ready to sit, study, and practice for a few hours a day?
How likely would the standard be to accept a proposal to add compile time reflection to the preprocessor, or even adopt C++'s constexpr?
My use case is creating a global array in a header from static compound literals in multiple source files at compile time, and outside of some crazy clang-tblgen type solution, or very platform specific linker hacks, it's completely unsupported by C.
Cheers from the shadowland :)
* Why can't the learning curve be solved using tools?
* Why don't we actively promote more higher level languages which are implemented in C (by fewer people)?
2. When will we get the Secure Annex K extensions?
3. When will we get mandatory warnings when the compiler decides to throw away statements it thinks it doesn't need? Like memset or assignments. Compilers are getting worse and worse, and certainly not better.
ad 1) Strings are Unicode nowadays, not ASCII. Nobody uses wchar but Microsoft. Everybody else is using utf8, but there's nothing in the standard. Not even search functions with proper casing rules and normalization. Searching for strings should be pretty basic enough.
2. The usual glibc answer is just bollocks. You either do compile-time bounds checks or you don't. But when you don't, you have to do it at runtime. So it's either the compiler's job, or the stdlib's job. But certainly not the user's.
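To make the memset concern in point 3 concrete: a call that clears a buffer which is never read again can be removed as a dead store. One commonly used workaround, sketched below under the assumption that the implementation does not elide volatile-qualified accesses, is to write the zeros through a pointer to volatile. The name `secure_clear` is hypothetical, and how strong this guarantee really is remains debated; the optional Annex K function `memset_s` is specified for this purpose where it is available.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: zeroes len bytes at buf through a pointer to
   volatile, so that the stores are treated as observable accesses. */
static void secure_clear(void *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    volatile unsigned char *p = buf;
    while (len-- > 0)
        *p++ = 0;
}
```

A typical use is `secure_clear(password, sizeof password);` just before the buffer goes out of scope.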
For reference I still use The C Programming Language by KERNIGHAN/RITCHIE and The Standard C Library by PLAUGER.
In my view what programmers need the most is good practices rather than any syntactic sugar.
I prefer C rather than any other programming language for its conciseness.
There are opportunities for any new programming language to replace C if it is at least backward compatible with K&R C SE (aka ISO C90) and provides portable access to de facto standard hardware acceleration such as SIMD instructions for vector computing.
For now we have to write SIMD-optimized libraries in assembly language in order to get the full calculation power of modern processors.
For programmers who expect C to bring them a hot drink, I would recommend them to stick with the bloated C++ framework which sometimes enlarges your p*s. :-P