I'm teaching a class to a bunch of high school and middle school students tomorrow who've all had moderate experience with programming. I'm covering pointer basics in C as a light intro or refresher, then focusing on cool (but relatively simple / not too crazy) tricks/tips/etc (e.g. stack walking, function pointer arrays). Care to share any of your favorite small pointer tricks with me for the class?
Thanks :)
-Cam
* Using && to take the address of a jump label.
* Casting a u_int32_t over a 4-byte string (like an SMTP verb) to get a value you can switch() on.
Had to look that one up...turns out it's a GCC trick that allows you to store the address of a jump label in a pointer to void. Later on, you can do "goto *ptr" to jump to that address. Neat. See http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html
(You obviously already know this...just putting it here in case anyone else hasn't heard of it and is curious)
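For the curious, here is a minimal sketch of what that looks like in practice. It only builds with compilers that support the labels-as-values extension (GCC, Clang), and the function name and the three-opcode "bytecode" are made up purely for illustration.

```c
/* Hypothetical toy interpreter built on GCC's labels-as-values extension.
   Opcodes (invented for this sketch): 0 = halt, 1 = add one, 2 = double.
   There is no validation: any other opcode indexes past the table. */
int run_program(const unsigned char *code)
{
    /* &&label yields a void * holding the address of the label. */
    static void *dispatch[] = { &&op_halt, &&op_inc, &&op_dbl };
    int acc = 0;

    goto *dispatch[*code++];      /* jump to the first instruction */

op_inc:
    acc += 1;
    goto *dispatch[*code++];
op_dbl:
    acc *= 2;
    goto *dispatch[*code++];
op_halt:
    return acc;
}
```

Each handler jumps straight to the next one through the table, so there is no central switch statement to come back to.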
This is awesome beyond words. If I stumbled across this in the wild I would flip flop between awe and disgust until my head exploded.
Let me guess, "SMTP verb" is not a hypothetical example?
1. Alignment. Many CPUs will trap (or worse) if accessing a 32-bit quantity at a non-aligned address.
2. Endianness. Needless to say, the 32-bit value read for a given string depends on the machine endianness.
3. Aliasing. Casting between different pointer types can result in a violation of the C aliasing rules and, with a little bad luck, incorrect results.
How do you ensure your code is portable between architectures with different endianness? One trick may be using htonl on the u_int32_t to get it into a canonical format, but there are probably better approaches?
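One possible approach (a sketch, not necessarily what the original poster does) is to never cast at all and instead build the 32-bit key from individual bytes with shifts. That gives the same value on any endianness and sidesteps the alignment and aliasing problems listed above. The macro, enum and function names here are hypothetical.

```c
#include <stdint.h>

/* Pack four characters into a 32-bit key, first character in the top byte.
   Only shifts and ORs are used, so the result does not depend on endianness. */
#define VERB4(a, b, c, d)                           \
    (((uint32_t)(unsigned char)(a) << 24) |         \
     ((uint32_t)(unsigned char)(b) << 16) |         \
     ((uint32_t)(unsigned char)(c) << 8)  |         \
      (uint32_t)(unsigned char)(d))

enum verb { VERB_UNKNOWN, VERB_HELO, VERB_MAIL, VERB_RCPT, VERB_DATA, VERB_QUIT };

enum verb classify_verb(const char *line)
{
    const unsigned char *p = (const unsigned char *)line;

    /* Make sure four characters really exist before reading them. */
    if (!p[0] || !p[1] || !p[2] || !p[3])
        return VERB_UNKNOWN;

    switch (VERB4(p[0], p[1], p[2], p[3])) {
    case VERB4('H', 'E', 'L', 'O'): return VERB_HELO;
    case VERB4('M', 'A', 'I', 'L'): return VERB_MAIL;
    case VERB4('R', 'C', 'P', 'T'): return VERB_RCPT;
    case VERB4('D', 'A', 'T', 'A'): return VERB_DATA;
    case VERB4('Q', 'U', 'I', 'T'): return VERB_QUIT;
    default:                        return VERB_UNKNOWN;
    }
}
```

The case labels are integer constant expressions, so the compiler can still turn the switch into a jump table or a compare chain.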
Node * iter;
for (iter=root; iter != NULL; iter=iter->next) {
/* iter->object; */
}
A concise implementation of strlen:
size_t strlen(char * str) {
    char * cur;
    for(cur=str; *cur; ++cur);
    return (cur-str);
}
Reverse a string in-place.
void reverse(char * str) {
    char *i,*j, tmp;
    /* i walks forward from the start, j walks backward from the end */
    for (i=str, j=(str+strlen(str)-1); i < j; ++i,--j) {
        tmp = *i;
        *i = *j;
        *j = tmp;
    }
}
Anyhow, when asked to write those on a blackboard, I typically do this:
size_t strlen(char* start) {
char* end=start;
while(*end) ++end;
return (end-start);
}
and
void reverse(char* i) {
    char* j=(i+strlen(i)-1);
    for (; i < j; ++i,--j) {
        *i ^= *j;
        *j ^= *i;
        *i ^= *j;
    }
}
for(cur=str; cur; ++cur);
should be:
for(cur=str; *cur; ++cur);
size_t strlen(char *s)
{
    size_t i = 0;
    while(*s++) i++;
    return i;
}
array[index] == index[array]
Not that you would actually use this, but it gave me a lot of insight into how addressing works inside the compiler. Also, this example implicitly suggests that an array can be treated as a pointer, which leads into pointer arithmetic, which can be very useful.
array[index] == *(array + index) == *(index + array) == index[array]
One expects the generic + in a+i to get expanded to raw addition .+. and raw multiplication .x., with s the size of the elements of the array:
a .+. s .x. i
One expects index + array to throw a type error, because index is just a number and doesn't have an element size. So I'm guessing that the real reason the trick works is that generic + has three methods, with signatures
int + int
array + int
int + array
*(array + index) == *(index + array)
Further broken down: *(&array[0] + index) == *(index + &array[0])
For example, a constraint in an AVL tree requires that the difference in heights of the left and right subtrees be -1, 0 or 1 (just 3 values, which requires 2 bits). A 4-byte aligned pointer would be enough, since its two low bits are always zero. =)
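A rough sketch of that last idea, stashing an AVL balance factor in the two low bits of an aligned pointer. It assumes that converting a pointer to uintptr_t and back behaves the way it does on mainstream platforms (the C standard leaves the details implementation-defined), and all the names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_MASK ((uintptr_t)3)   /* the two low bits of a 4-byte aligned pointer */

/* Pack a pointer and a balance factor in {-1, 0, 1} into one word. */
uintptr_t tag_pack(void *ptr, int balance)
{
    uintptr_t bits = (uintptr_t)ptr;

    assert((bits & TAG_MASK) == 0);        /* pointer must be 4-byte aligned */
    assert(balance >= -1 && balance <= 1);
    return bits | (uintptr_t)(balance + 1); /* store the balance as 0, 1 or 2 */
}

/* Recover the original pointer by clearing the tag bits. */
void *tag_ptr(uintptr_t tagged)
{
    return (void *)(tagged & ~TAG_MASK);
}

/* Recover the balance factor from the tag bits. */
int tag_balance(uintptr_t tagged)
{
    return (int)(tagged & TAG_MASK) - 1;
}
```

The cost is that every dereference has to go through tag_ptr first, which is easy to forget.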
Good luck!
I actually write this a lot in my code. Not code that I intend to share with others, of course. But I do find the value I put in the brackets amusing. I find it amusing that I have an OCD-like predisposition to make it a power of two. And I find it amusing how the number between the brackets has increased over the last ten years, from a frugal 64 to an opulent 1024. This, to me, is progress.
uint8_t *bufp;              /* same element type as buf */
uint8_t buf[1024];
if (need_sz >= 1024)
    bufp = malloc(need_sz);
else
    bufp = buf;
....
if (bufp != buf) free(bufp);
Also, making it a power of 2 is a good idea if you are concerned about alignment.
struct packet_header { uint from_addr; uint to_addr; ushort flags; ... };
struct packet_header *p;
read(socket, somebuf, sizeof(struct packet_header));
p = (struct packet_header *)somebuf;   /* overlay the struct on the raw bytes */
printf("from = %u to = %u flags = %u\n", p->from_addr, p->to_addr, p->flags);
The bigger problem with this scheme is alignment, although we appear to have outgrown architectures that will blow up when you get this wrong.
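If alignment (or byte order) is a worry, one alternative is to decode the fields a byte at a time instead of overlaying the struct. This is only a sketch: the 10-byte big-endian wire layout and the helper names are assumptions of mine, not something specified above.

```c
#include <stddef.h>
#include <stdint.h>

struct packet_header {
    uint32_t from_addr;
    uint32_t to_addr;
    uint16_t flags;
};

/* Assumed wire layout: 4 + 4 + 2 bytes, big-endian, no padding. */
#define WIRE_HEADER_SIZE 10

uint32_t get_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

uint16_t get_be16(const unsigned char *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

/* Decode a header from raw bytes; returns 0 on success, -1 if too short.
   Works for any buffer address because it only ever reads single bytes. */
int parse_header(const unsigned char *buf, size_t len, struct packet_header *out)
{
    if (len < WIRE_HEADER_SIZE)
        return -1;
    out->from_addr = get_be32(buf);
    out->to_addr   = get_be32(buf + 4);
    out->flags     = get_be16(buf + 8);
    return 0;
}
```

It is more typing than the cast, but it also stops depending on how the compiler pads the struct.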
"<a href='blah'>"
gets modified during parsing to become: "<a\0href\0'blah\0>"
Your parse tree result can then just contain pointers to "a\0", "href\0", and "blah\0" rather than doing any copying.
Incidentally I only found this after trying:
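char *tmp = "cat,mat,sat";   /* points at a string literal */
char *t;
t = strtok(tmp,",");
while(t != NULL){
    printf("%s\n",t);
    t = strtok(NULL,",");
}
And getting a segfault, as 'tmp' points at a string literal, which is not writeable memory (declaring it as char tmp[] = "cat,mat,sat"; gives strtok a writable copy instead).
struct name {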
int namelen;
char namestr[1];
};
struct name *makename(char *newname)
{
struct name *ret =
malloc(sizeof(struct name)-1 + strlen(newname)+1);
/* -1 for initial [1]; +1 for \0 */
if(ret != NULL) {
ret->namelen = strlen(newname);
strcpy(ret->namestr, newname);
}
return ret;
}
(From http://c-faq.com/struct/structhack.html )
Simple way of storing a string and its length in one allocated structure.
Others: virtual function tables, function pointers inside of structs that take a "this" argument (effectively giving you OOP), opaque pointers to give compile- and run-time private encapsulation...
Here's a nice discussion on Stack Overflow, including a bunch of C++ guys saying to just use vectors, which ignores the entire point of getting a structure with only one memory allocation:
http://stackoverflow.com/questions/688471/variable-sized-str...
Therefore I contest that your way is the "correct" way, especially since most C code is not C99 code. Also I'd probably never use this in C++. If you want to let subsequent programmers know you're using the variable length structure, add the comment: /* unwarranted chumminess with the compiler */
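For comparison, the C99 spelling being argued about here is presumably the flexible array member. A minimal sketch, reusing the names from the c-faq example above (with namelen switched to size_t, which is my own change):

```c
#include <stdlib.h>
#include <string.h>

struct name {
    size_t namelen;
    char namestr[];   /* C99 flexible array member: no [1] and no "-1" fixup */
};

struct name *makename(const char *newname)
{
    size_t len = strlen(newname);
    /* sizeof *ret excludes the flexible array, so add room for text plus \0 */
    struct name *ret = malloc(sizeof *ret + len + 1);

    if (ret != NULL) {
        ret->namelen = len;
        memcpy(ret->namestr, newname, len + 1);   /* copies the \0 too */
    }
    return ret;
}
```

Either way the caller still gets the whole thing back with a single free().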
void strcpy(char *s, char *d) {
    while(*d++ = *s++);
}
For example, if I need to call read, it's basically an invocation of a function with a 4 byte, 8 byte, and then 4 byte argument (on a 64-bit system). I have a union argument type that represents 32-bit and 64-bit parameter types at the same time. I make a call like this:
rv = abstractInvoke(&fun, args);
which, depending on arity of fun, invokes something like this (auto-generated):
rv = abstractInvoke(fun, args[0], args[1], args[2]);
That three-arg function looks roughly like this (calling my argument type ``a_t'' to abbreviate):
a_t abstractInvoke(struct ftype *fun, a_t a0, a_t a1, a_t a2) {
static a_t (*allfuns[])(const void*, a_t a0, a_t a1, a_t a2) = {
invoke_3_4_444, // 0
invoke_3_8_444, // 1
invoke_3_4_448, // 2
invoke_3_8_448, // 3
invoke_3_4_484, // 4
invoke_3_8_484, // 5
// [...]
invoke_3_8_888, // 15
};
return allfuns[fun->offset](fun->orig, a0, a1, a2);
}
I generate all of the functions by arity and types and then compute an array offset of them ahead of time (so read has an offset of 5 since it returns an 8 byte value and its arguments take a 4 byte, 8 byte, and then 4 byte value).
Without this trick, I'd have to use very non-portable assembler to do the same invocation (OK, I'd use libffi or dyncall's already prepackaged very non-portable assembler, which I may end up with anyway) to make this function call.
It will certainly work for 90% of the common function types out there and is a pretty common trick.
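Reading the table above, the offset looks like one bit per slot: bit 0 for the return value, then one bit each for the last, middle and first argument, set when that slot is 8 bytes wide. Assuming the elided entries follow the same pattern, computing it could be sketched like this (the function name is hypothetical):

```c
/* Map a return width and three argument widths (each 4 or 8 bytes) to an
   index into a 16-entry table laid out like invoke_3_<ret>_<a0><a1><a2>. */
unsigned invoke_offset(unsigned ret, unsigned a0, unsigned a1, unsigned a2)
{
    return  (unsigned)(ret == 8)           /* bit 0: return value    */
         | ((unsigned)(a2  == 8) << 1)     /* bit 1: last argument   */
         | ((unsigned)(a1  == 8) << 2)     /* bit 2: middle argument */
         | ((unsigned)(a0  == 8) << 3);    /* bit 3: first argument  */
}
```

With the widths given for read (8-byte return; 4, 8 and 4 byte arguments) this comes out to 5, matching the offset mentioned above.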
int moved = (is_reading ? read : write)(fd, buf, len);
We start with the turtle and rabbit pointing to the head, and then on each clock tick we advance the rabbit by one step. After a while, assuming we've neither found the end of the list nor caught the turtle, we teleport the turtle to the rabbit's position, double the length of time we're willing to wait, then go again.
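The turtle-and-rabbit description translates into code fairly directly. A minimal sketch for a singly linked list, with a hypothetical node type and function name:

```c
#include <stddef.h>

struct node {
    struct node *next;
};

/* Returns 1 if following next pointers from head eventually loops, 0 if
   the list ends in NULL. */
int has_cycle(const struct node *head)
{
    const struct node *turtle = head;
    const struct node *rabbit = head;
    size_t steps = 0;    /* rabbit steps since the last teleport */
    size_t limit = 1;    /* how long we are willing to wait      */

    while (rabbit != NULL) {
        rabbit = rabbit->next;          /* one clock tick */
        if (rabbit == NULL)
            break;                      /* found the end of the list */
        if (rabbit == turtle)
            return 1;                   /* caught the turtle: cycle  */
        if (++steps == limit) {
            turtle = rabbit;            /* teleport the turtle       */
            steps = 0;
            limit *= 2;                 /* wait twice as long        */
        }
    }
    return 0;
}
```

Unlike the two-speed version, the rabbit is the only pointer that ever walks the list, so each node is visited by at most one moving pointer per step.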
I think that data structures like trees are a nice thing to continue once the fundamental properties of pointers have been discussed. (pointer to left, right and data...).
Then sketch out the operations necessary to insert elements, and then explain how pivoting the tree makes it stay performant. This gives you a lot of opportunity to use pointers-to-pointers and so on.
It's a thing that can be nicely visualized on a whiteboard, and it has relevance for students to understand things like databases or filesystems.
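As a starting point for the whiteboard, here is a minimal sketch of binary search tree insertion using a pointer-to-pointer. It leaves out the pivoting (rotation) step entirely, and the struct and function names are hypothetical.

```c
#include <stdlib.h>

struct tree {
    int key;
    struct tree *left;
    struct tree *right;
};

/* Insert key into the tree rooted at *root.
   Returns 1 if inserted, 0 if the key was already there, -1 if malloc failed. */
int tree_insert(struct tree **root, int key)
{
    /* link always points at the pointer we would have to overwrite, so the
       empty tree needs no special case. */
    struct tree **link = root;

    while (*link != NULL) {
        if (key == (*link)->key)
            return 0;
        link = (key < (*link)->key) ? &(*link)->left : &(*link)->right;
    }

    *link = malloc(sizeof **link);
    if (*link == NULL)
        return -1;
    (*link)->key = key;
    (*link)->left = NULL;
    (*link)->right = NULL;
    return 1;
}
```

The same link variable is what a rotation would rewrite, which is why the pointer-to-pointer form pays off once balancing is added.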
If you care to elaborate, you could continue by explaining the pitfalls of multithreading, or locking all or parts of the tree during updates, making updates visible in an atomic way... but that gets out of your "not too simple/crazy" limit pretty fast.
You could also make up a nice producer/consumer example where parts of the data to be processed are passed around by pointers, stored in linked lists, sliced up/combined. Processing images (rotating tiles), or drawing fractals in parts of memory pointed to by some variable, comes to mind.
If you have a struct/class with a lot of members that are usually set to zero or some other initial value, you can store them in a "lookaside" structure that is hung off a global hash table with the pointer of the original object as the hashtable key. You can then use a bitfield to keep track of which members actually have interesting data.
So -- accessing the member would look something like this:
int MyClass::get_foo() {
if (foo_set_)
return global_lookaside[this].foo;
return 0;
}
Simply, you can subtract pointers. Let's say you're walking a string from the front and from the back at the same time, and want to find the length of the substring. Well, you don't have to use indices, just do this:
int len = back_ptr - front_ptr;
You'd be surprised how often this crops up when you're using lots of pointer tricks.
while (*dst++ = *src++);
char digitAsChar = "0123456789"[digitAsInt%10];