To really drive home the primitiveness of C arrays, one should probably also mention that, because addition is commutative, you could also write
i[array]
and somewhat surprisingly, it will compile, and work, and it means "*(i + array)", which is equivalent to "*(array + i)". But nobody really does that, because that would be kind of insane.
putc(2["ABCDEF"], stdout);
This prints 'C'.

int const& x; // C++
A reference is functionally equivalent to a const pointer. (Reference reassignment is disallowed. Likewise, you cannot reassign a const pointer. A const pointer is meant to keep its pointee [address].) The difference between them is that C++ const references also allow non-lvalue arguments (temporaries).
It is much easier to read from right to left when decoding types. Look for yourself:
- double (* const convert_to_deg)(double const x) // const pointer to function taking a const double and returning double
- int const (* ptr_to_arr)[42]; // pointer to array of 42 const ints
- int const * arr_of_ptrs[42]; // array of 42 pointers to const ints
- int fun_returning_array_of_ints()[42]; // function returning an array of 42 ints -- illegal in C
Try it out yourself: https://cdecl.org/
Hence, I am an "East conster". (Many people are "West consters" though.)
You can return function pointers:
typedef struct player_t player_t; // let it be opaque ;)
int game_strategy1(player_t const * const p)
{
/* Eliminate player */
return 666;
}

int game_strategy2(player_t const * const p)
{
/* Follow player */
return 007;
}

int (* const game_strategy(int const strategy_to_use))(player_t const * const p)
{
if (strategy_to_use == 0)
return &game_strategy1;
return &game_strategy2;
}

Functional programming = immutable (const) values + pure functions (no side effects).
Consting for me is also a form of documentation/specification.
"East const" for life! :)
Just saying that "actually it's just array + i" makes more sense - for me(!).
To make it worse, it’s in a parser for external data in binary format, where you really shouldn’t be playing funny tricks.
Yeah. That's why people don't do that in C. It's more like most C programmers probably weren't aware of this. After your comment, we'll start to see C codebases everywhere with that.
If you want to manipulate memory directly - which is risky though sometimes useful - C is one of the best languages in which to do it. Memory addresses are numbers, and C will let you work with those numbers in whatever way you want: add, subtract, multiply, divide... and if you didn't shudder at the suggestion of dividing a pointer because there are very, very few reasons to do so then C is not the language for you!
If you don't want to manipulate memory directly, you probably shouldn't be using C; stick with a nice garbage-collected, type-safe, object-oriented, cross-platform language. If you do want to manipulate memory directly, but you want more guarantees on what you can do with pointers, try Rust.
I once proposed a backwards-compatible way out for C.[1] Basically, you get to talk about arrays as objects with a length, even if that length is in some other variable. And you get slices and references. It was discussed enough to establish that it could work, but the effort to push it forward wasn't worth it.
Slices are important. They let you do most of the things people do with pointer arithmetic. Once you have size and slices, you can still operate near the memory address level.
[1] http://www.animats.com/papers/languages/safearraysforc43.pdf
https://www.digitalmars.com/articles/C-biggest-mistake.html
except mine is much simpler :-)
The proven utility and effectiveness of this is apparent in D.
A type like yours could be added, just like a string library like SDS, but it will never happen.
That was a very informative read. I feel like I am the exact target audience: I'm coming from a programming background steeped in C# and I'm learning C from the K&R book.
https://modernc.gforge.inria.fr
The page has a link to a free PDF version.
The Descent to C - https://news.ycombinator.com/item?id=15445059 - Oct 2017 (2 comments)
The Descent to C - https://news.ycombinator.com/item?id=8127499 - Aug 2014 (15 comments)
The descent to C - https://news.ycombinator.com/item?id=7134798 - Jan 2014 (230 comments)
Worth mentioning that C is over 40 years old, and was designed to be easily portable across a range of machines that had less compute power and memory than today's smaller microcontrollers.
As a result, a lot of things were left undefined, or were designed in a way to be easy to implement rather than easy to program for.
There existed other programming languages that were better, but their compilers weren't as broadly available, and their better features came at the cost of speed, which at the time was a premium.
I'd tweak your statement a little, or even a lot: "left undefined" most often meant "left to be defined by the compiler writers to fit the architecture of the underlying hardware, in a way that would make it easy to program to beneficially exploit features of the architecture"; and (yes) in a way "that would not be very portable, and might even be subject to change between compilers".
Did the underlying machine use 2's complement? Did it have addressable bytes? Big endian? 8, 16, 32 or 36 bits?
These are all things you need to know to write tight efficient code in the days of slow clockspeeds and limited RAM. C let you do that without using assembly, but by using the "undefined" features of the language, because they were clearly defined locally and were features that were very important to be easy to write code for.
Consider how you would implement setjmp and longjmp, or even printf, or efficiently unpack or serialize bits for a communications protocol, without these supposedly "undefined" features, or how you would write those if those features were actually undefined. People who put strlen(str) or a divide and a mod in the control expression of a loop would know better if they understood a bit more about the undefined features.
This is in contrast, by the way, with some other things that actually are undefined, such as the order of evaluation of the complex expressions making up argument lists, etc.
I'm writing this explanation not so much to explain these technical details to noobs, but rather to get the people who understand this stuff to stop throwing around the term "undefined" with regard to C because they are cooperating in the evisceration of some ideas that are really worth exploring or understanding more deeply.
There is indeed a big difference between "undefined" behavior and "implementation-defined" behavior.
For example, there's a lot of spooky "undefined behavior" around dereferencing pointers. In one famous case [1], dereferencing a pointer actually led the compiler to skip a later check on whether the pointer was NULL, because if the pointer was already dereferenced then it must have been valid.
Another classic "implementation-defined" detail: how many bits are in a "char" (CHAR_BIT)? Nowadays we can readily assume it's 8, but that wasn't so guaranteed when C was written!
It's just that those other languages were only available on their host OS.
The symbolic price for the license, availability of source tapes and the UNIX V6 annotated source code book made the rest.
1. Much of section 2 is wrong except the part about arrays representing contiguous objects. The rest is largely an implementation detail.
Zeta C and Vacietis work considerably differently, as allowed by the standard.
In addition there are many (mostly obsolete now) architectures in which, when you convert a pointer to an integer, you can't perform arithmetic and convert back because a pointer isn't just an integer address; it could represent segments or support hardware tags.
> C will typically let you just construct any pointer value you like by casting an integer to a pointer type, or by taking an existing pointer to one type and casting it so that it becomes a pointer to an entirely different type.
To be fair, they do say "typically" in here, but these behaviors are (depending on the case) all either implementation defined or undefined; the C standard specifies a union as the only well-defined way to type-pun to non character types.
> The undefined-behaviour problem with integer overflow wouldn't happen in machine code; that's a consequence of C needing to run fast on lots of very different kinds of computer, which is a problem machine code doesn't even try to solve
Some architectures trap on integer overflow, which I suspect is the reason why integer overflow is undefined rather than implementation defined. Certainly compilers today take advantage of the fact that it is undefined to make certain useful optimizations, but from what I can tell of the history that's not why it was undefined in the first place.
This is why some C programmers dictate that all code must use signed integers to avoid unexpected bugs, but many others (including myself) disagree that's a good way of going about it since, as you said, it's not guaranteed to trap or do anything to help the programmer.
uint32_t foo(uint8_t x) { return x << 24; }
Yes, this is signed integer overflow: x is promoted to (signed) int before the shift, so if int is 32 bits wide and the top bit of x is set, the behavior is undefined. Fortunately I've never seen a compiler optimize this to stupidity.

Some compilers and libraries, such as GNU, do have some improvements. For example, in GNU C you can make zero-length arrays (which I use sometimes), write ?: without anything in between (which I use often), and some other things.
They say there is no object-oriented programming in C. Well, C doesn't have object-oriented features, although you can still do some limited object-oriented stuff in cases where it is useful. For example, there is the stream object (called FILE); GNU has a fopencookie function to write your own implementation of the stream interface too, even though standard C doesn't have that.
Object-oriented programming is good for some things but too often is overused in modern programming, I think. You shouldn't need object-oriented programming for everything.
It is true that some of the undefined behaviour stuff is too confusing and perhaps should be changed; in some cases the compiler has options to control these things, such as -fwrapv (which I often use).
I like the string handling of C; you can easily skip some from the beginning, and can sometimes use the string functions with non-text data too, and it doesn't use Unicode.
It says "C lets you take the address of any variable you like", but this is not quite true. There is a "register" command which means that you cannot take the address of something.
I think that many things in C (both things that they mention and some that they don't) (including pointer arithmetic, no bound checking, untagged unions, string handling, not using Unicode, setjmp/longjmp, etc) are often advantages of C.
Edit: I define the level of a language as the lowest-possible feature. I guess others define level as the highest-possible feature? I don't really know who's right here.
https://en.m.wikipedia.org/wiki/High-level_programming_langu...
> Modern high-level languages generally try to arrange that you don't need to think
> or even know – about how the memory in a computer is actually organised
Modern high-level languages try to arrange that you don't need to focus on the irrelevant. If you're working on, say, an accounting system, memory layout is not part of the problem you are trying to solve.
For certain applications, C is simply too low level.
A language is too low level when it forces you to focus on the irrelevant.
For low level operations you probably cannot beat C.
It's not hard to beat it. For example, you cannot do vector operations in C. (Hence C compilers often offer extensions.)
I can't imagine living in a third-world country with today's Electron catastrophe.
With these in mind, Rust is lower and higher level than C at the same time.
* other than some compiler-specific pragmas, but I would be hesitant to call that natively supported
It's effectively the lowest you can go without throwing portability out the window. It doesn't let you manage caches directly, but it gives you good control over memory layout in general, and that's often enough to give you good cache usage across a variety of chips.
If you want to go lower than that, you're probably looking for assembly.
Hardware is the reality. It's not very much like the Java or Python programming model. We shouldn't hide this from programmers.
A good developer will IMO select an abstraction that best matches their goals. If you’re writing a device driver then the abstraction might be assembly language. If you’re writing business logic it might be PL/pgSQL. But regardless of which abstraction you choose, you’re hiding something from someone.
I feel like C exists at a level below such concepts. Simply being able to define a function `void do_stuff(struct mystruct *obj)` opens the door to object-oriented style programming. A lot of people seem to define OOP by the presence of superficial features like inheritance and polymorphism, but those are additional concepts that aren't useful for every program; the real difference is mutating state on the heap.

So you could say C is object-oriented by default, because it doesn't stop you from doing this, unlike a higher-level language like Clojure, which simply doesn't have mutation (for the most part). Or you could say C is a functional language, because if you don't explicitly pass pointers you get copies. Really it's both and it's neither. It's whatever you want it to be.
C++ is object oriented not because it has compile-time support for polymorphism or any of that other bad programming practice, but because classes have code sections that live with them, whether on the stack or in the heap, that can operate only on memory belonging to that instance of the class.
Object oriented programming is a coding style and choice. Some languages make it a first class part of the language design. It is purposefully not part of C.
However you can do OOP like things in C: a popular paradigm is to pass around pointers to structs that (should) live in the heap, and to have a number of functions which work on these structs. This is very similar in practice and mental modelling to OOP as users of C++ might know it, but is distinct in that no code ever lives in the stack or heap, and no code is restricted from operating on any of the program memory.
On a modern system you can't usually do that because of W^X rules (also on a non-x86 modern system the performance would be abysmal if you tried because why waste transistors supporting something only crazy people would want?)
So perhaps notionally in the abstract machine if I have sixteen Clowns in a C++ vector there are sixteen copies of the Clown method squirt_water_at() in the vector too, but I assure you all the compiler emits is one copy of squirt_water_at() for Clowns, to the text segment with the rest of the program code, and maybe if Clowns are virtual, a pointer to a table of such functions lives with each Clown just in case there are Jugglers and LionTamers in the vector too - although compilers can sometimes figure out a rationale for not bothering.
Was demonstrating the difference between inheritance and composition in OOP C to my junior dev just this week.