But strings in BASIC are so simple. They just work. I decided when designing D that it wouldn't be good unless string handling was as easy as in BASIC.
Would you mind sharing the things that you look for, from the obvious to the subtle? I would love to see some rejected pull requests if possible. If I were writing C under your direction, what would you drill into me?
Thank you, it is an honour to address you here.
2. be aware of all the C string functions that do strlen. Only do strlen once. Then use memcmp, memcpy, memchr.
3. assign strlen result to a const variable.
4. for performance, use a temporary array on the stack rather than malloc. Have it fail over to malloc if it isn't long enough. You'd be amazed how this speeds things up. Use a shorter array length for debug builds, so your tests are sure to trip the fail over.
5. remove all hard-coded string length maximums
6. make sure size_t is used for all string lengths
7. disassemble the string handling code you're proud of after it compiles. You'll learn a lot about how to write better string code that way
8. I've found subtle errors in online documentation of the string functions. Never rely on it. Use the C Standard, especially for the `n` string functions.
9. If you're doing 32 bit code and dealing with user input, be wary of length overflows.
10. check again to ensure your created string is 0 terminated
11. check again to ensure adding the terminating 0 does not overflow the buffer
12. don't forget to check for a NULL pointer
13. ensure all variables are initialized before using them
14. minimize the lifetime of each variable
15. do not recycle variables - give each temporary its own name. Try to make these temporaries const, refactor if that'll enable it to be const.
16. watch out for `char` being either signed or unsigned
17. I structure loops so the condition is <. Avoid using <=, as odds are high that it'll result in a fencepost error.
That's all off the top of my head. Hope it's useful for you!
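Several of the items above (one strlen per string, const lengths, size_t, a stack buffer that fails over to malloc, the overflow and terminator checks) can be sketched together. This is only a rough illustration, not code from the list's author: `with_joined`, `visit_fn`, `checked_length` and `STACK_BUF_LEN` are made-up names, and the buffer sizes are arbitrary.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sizes: tiny in debug builds so tests trip the malloc path. */
#ifdef NDEBUG
#define STACK_BUF_LEN 256
#else
#define STACK_BUF_LEN 8
#endif

typedef size_t (*visit_fn)(const char *s, size_t len);

/* Hypothetical helper: builds a + b in a temporary buffer, passes it to
   visit(), and returns visit()'s result, or (size_t)-1 on failure. */
size_t with_joined(const char *a, const char *b, visit_fn visit)
{
    if (a == NULL || b == NULL || visit == NULL)      /* item 12 */
        return (size_t)-1;

    const size_t alen = strlen(a);                    /* items 2, 3, 6 */
    const size_t blen = strlen(b);

    if (blen >= SIZE_MAX - alen)                      /* items 9, 11 */
        return (size_t)-1;
    const size_t total = alen + blen + 1;             /* +1 for the 0 */

    char stackbuf[STACK_BUF_LEN];                     /* item 4 */
    char *const buf = (total <= sizeof stackbuf) ? stackbuf
                                                 : (char *)malloc(total);
    if (buf == NULL)
        return (size_t)-1;

    memcpy(buf, a, alen);                             /* no second strlen */
    memcpy(buf + alen, b, blen);
    buf[total - 1] = '\0';
    assert(buf[total - 1] == '\0');                   /* item 10 */

    const size_t result = visit(buf, total - 1);

    if (buf != stackbuf)
        free(buf);
    return result;
}

/* Example visitor: returns len if the string really is len bytes long. */
size_t checked_length(const char *s, size_t len)
{
    return (strlen(s) == len) ? len : (size_t)-1;
}
```

In a debug build, `with_joined("ab", "cd", checked_length)` stays on the stack, while `with_joined("hello, ", "world", checked_length)` needs 13 bytes and goes through malloc.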
I was working on a RISC processor and somebody started using various std lib functions like memcpy from a linux tool chain. I got a bug report - it crashed on certain alignments. Made sense - this processor could only copy words on word alignment etc.
So I wrote a test program for memcpy. Copy 0-128 bytes from a source buffer from offsets 0-128 to a destination buffer at offset 0-128, all combinations of that. Faulted on an alignment issue in code that tried to save cycles by doing register-sized load and store without checking alignment. That was easy! Fixed it. Ran again. Faulted again - different issue, different place.
Before I was done, I had to fix 11 alignment issues. A total fail for whoever wrote that memcpy implementation.
What was the lesson? Well, writing exhaustive tests is a good one. Not blindly trusting std intrinsic libraries is another.
But the one I took with me was, why the hell isn't there an instruction in every processor to efficiently copy from arbitrary source to arbitrary destination with maximum bus efficiency? Why was this a software issue at all! I've been facing code issues like this for decades, and it seems like it will never end.
</rant>
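The exhaustive test described above (every length 0-128, every source offset 0-128, every destination offset 0-128) might look roughly like this. It is a sketch under assumptions, not the original test program: the names, the guard value and the fill pattern are invented, and it checks the copied bytes and the bytes around them rather than relying on a hardware alignment fault.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { MAX_N = 128, N_CASES = MAX_N + 1, BUF_LEN = 2 * MAX_N + 1, GUARD = 0xFF };

typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

/* Returns how many (length, source offset, destination offset)
   combinations produced a wrong destination buffer. */
size_t exhaustive_copy_test(copy_fn copy)
{
    static unsigned char src[BUF_LEN], dst[BUF_LEN], guard[BUF_LEN];
    size_t failures = 0;

    assert(copy != NULL);
    memset(guard, GUARD, sizeof guard);
    for (size_t i = 0; i < sizeof src; i++)
        src[i] = (unsigned char)(i % 251);        /* never equals GUARD */

    for (size_t len = 0; len < N_CASES; len++)
        for (size_t s = 0; s < N_CASES; s++)
            for (size_t d = 0; d < N_CASES; d++) {
                memset(dst, GUARD, sizeof dst);
                copy(dst + d, src + s, len);

                const size_t tail = d + len;
                const int ok =
                    memcmp(dst, guard, d) == 0 &&              /* untouched before */
                    memcmp(dst + d, src + s, len) == 0 &&      /* copied correctly */
                    memcmp(dst + tail, guard, sizeof dst - tail) == 0; /* untouched after */
                if (!ok)
                    failures++;
            }
    return failures;
}

/* Hypothetical buggy copy: handles whole 4-byte words and drops the tail. */
void *word_only_copy(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n - n % 4);
}
```

`exhaustive_copy_test(memcpy)` should report no failures on a correct library, while `exhaustive_copy_test(word_only_copy)` reports one for every length that is not a multiple of 4.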
What will be the alternative for strncpy/strncat? I thought they're a safer strcpy/strcat but now I need something to replace them.
I assume snprintf for sprintf, vsnprintf for vsprintf.
No idea what to do with gmtime/localtime/ctime/ctime_r/asctime/asctime_r, any alternatives for them too?
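For the strncpy/strncat half of the question, one option that stays within standard C is to lean on snprintf, as suggested above for sprintf. A minimal sketch, assuming truncation should be reported rather than silently accepted; `copy_bounded` and `append_bounded` are hypothetical names, not library functions:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies src into dst (capacity cap bytes), always
   0-terminating. Returns 1 if everything fit, 0 if truncated or invalid. */
int copy_bounded(char *dst, size_t cap, const char *src)
{
    if (dst == NULL || src == NULL || cap == 0)
        return 0;
    const int n = snprintf(dst, cap, "%s", src);
    if (n < 0)
        return 0;
    assert(memchr(dst, '\0', cap) != NULL);   /* snprintf always terminates */
    return (size_t)n < cap;                   /* n is the untruncated length */
}

/* Hypothetical helper: appends src to the string already in dst. */
int append_bounded(char *dst, size_t cap, const char *src)
{
    if (dst == NULL || src == NULL)
        return 0;
    const char *const end = memchr(dst, '\0', cap);
    if (end == NULL)                          /* dst not terminated within cap */
        return 0;
    const size_t used = (size_t)(end - dst);
    const int n = snprintf(dst + used, cap - used, "%s", src);
    return n >= 0 && (size_t)n < cap - used;
}
```

Unlike strncpy, the destination is always terminated and never padded, and the return value says whether anything was cut off.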
char buffer[2000];
strncpy(buffer, "hello", sizeof buffer);
writes "hello" and 1995 zero bytes to the buffer.
Thank you, have a great weekend!
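The padding behaviour of strncpy mentioned above is easy to check for yourself. A small sketch (the helper name and the 'x' fill are invented for illustration) that counts how many zero bytes strncpy leaves in the first `cap` bytes of a 2000-byte buffer:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { BUF_LEN = 2000 };

/* Fills a BUF_LEN buffer with 'x', runs strncpy(buffer, src, cap), and
   returns the number of zero bytes found in buffer[0..cap). */
size_t zeros_after_strncpy(const char *src, size_t cap)
{
    char buffer[BUF_LEN];

    assert(src != NULL && cap <= sizeof buffer);
    memset(buffer, 'x', sizeof buffer);
    strncpy(buffer, src, cap);

    size_t zeros = 0;
    for (size_t i = 0; i < cap; i++)
        if (buffer[i] == '\0')
            zeros++;
    return zeros;
}
```

`zeros_after_strncpy("hello", 2000)` gives 1995, matching the count above. The other edge is the nastier one: `zeros_after_strncpy("hello", 5)` gives 0, meaning the result is not terminated at all.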
A better comparison would be between C and Turbo Pascal strings in DOS times. TP strings were limited to 255 characters but they were almost as fast as C strings, in some operations (like checking length) they were faster, and you had to work very hard to create a memory leak or security problem using them.
I learnt Pascal before C, and the whole mess with arrays/strings/pointers was shocking to me.
Turbo Pascal wasn't released until 1983, if the wiki is to be believed.
Naturally, our teacher wisely pushed hard on figuring out what you could on paper first.
It is by far my favorite language, because it is filled with elegant solutions to hard language problems.
As a perfectionist, there are very few things I would change about it. People rave about Rust these days, but I rave about D in return.
Just wanted to say thanks (and that I bought a D hoodie).
So a severely memory limited architecture of the 70s led to blending of data with control - which is never a safe idea, see naked SQL. We now perpetuate this madness of nul-terminated strings on architectures that have 4 to 6 orders of magnitude more memory than the original PDP-11.
It's also highly inefficient, because the length of a string is a fundamental property that must be recomputed frequently if not cached.
Bottom line, unless you work on non-security sensitive embedded systems like microwave ovens or mice, there is absolutely no place for nul-terminated strings in today's computing.
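For what carrying the length alongside the data looks like in plain C, here is a minimal sketch. `str_view` and the `sv_*` functions are hypothetical names, not an existing library; the point is only that the length is stored once and the bytes themselves carry no control information.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type: a non-owning view of bytes with an explicit length. */
typedef struct {
    const char *ptr;
    size_t      len;
} str_view;

/* The only place a scan for the terminator is paid. */
str_view sv_from_cstr(const char *s)
{
    const str_view v = { s, (s != NULL) ? strlen(s) : 0 };
    return v;
}

/* Length comparison first, so unequal lengths cost O(1). */
int sv_equal(str_view a, str_view b)
{
    assert((a.ptr != NULL || a.len == 0) && (b.ptr != NULL || b.len == 0));
    return a.len == b.len && (a.len == 0 || memcmp(a.ptr, b.ptr, a.len) == 0);
}

/* Sub-view [start, end) without copying or writing a terminator;
   out-of-range bounds are clamped to the view. */
str_view sv_slice(str_view v, size_t start, size_t end)
{
    if (end > v.len)
        end = v.len;
    if (start > end)
        start = end;
    const str_view out = { (v.ptr != NULL) ? v.ptr + start : NULL, end - start };
    return out;
}
```

A view can also hold zero bytes as ordinary data, e.g. `{ "a\0b", 3 }`, which a nul-terminated string cannot represent.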