Two Star Programming (opens in new tab)

(wordaligned.org)

121 pointsrix0r13y ago63 comments

63 comments

48 comments · 17 top-level

muyuu13y ago· 5 in thread

I don't see how Linus' version is better or more readable. By adding an indirection you have the equivalent of a dummy node, both ways are textbook examples of singly linked list deletion.

Note that you save no memory by not naming the implicit (in this implementation) dummy node. A pointer to a pointer remains a pointer to a pointer whether you name the intermediate pointer or not. Personally I think it's more readable to name it. Also, I prefer curr != NULL to simply putting *curr in a condition context. The compiler is going to do exactly the same.

I don't favour Linus' version because it's not readable. Most people who haven't seen this before will need to take some time to check this code. Out of the three basic ways of doing this I'd place this one dead last.

adamtj13y ago

This is exactly what Linus was complaining about.

Your argument is like trying to claim that twenty lines of string munging is more readable than a straightforward regex. That's only true if you don't really get regular expressions.

If you understood pointers well enough, you would see Linus's version as more readable. The other one has so much more unnecessary fluff that makes it extra complicated and less readable. If you don't understand pointers well enough, and "haven't seen this before", of course you'll find it confusing.

Readability is in the eye of the beholder.

betterunix13y ago

"If you understood pointers well enough, you would see Linus's version as more readable"

Sounds like an argument for not using C in the first place. If you need to understand pointers so well that the code presented in this article looks readable just to write a clean implementation of remove_if, your language is seriously lacking in expressiveness.

Data structures are best reasoned about abstractly. A linked list is either null or a pair [data, next] where next is a linked list. remove_if(list, fn) returns null if list is null; if list is [data, next], remove_if returns next if fn(data) is true, otherwise [data, remove_if(next)]. These definitions are much easier to understand than code with two levels of indirection; they are also easily restated as code, without requiring so many levels of pointers.

To put it another way, the shorter the transition between a natural language definition of a data structure/operation and the source code, the better. Since it is almost never the case that a natural definition will involve pointers, pointers should be used sparingly in an implementation. Pointers are unintuitive, easy to make mistakes with, and have been the source of most of the serious bugs we have dealt with over the past few decades; when I see the Linux kernel panic, it is usually because of a problem with a pointer.

I would rather see a few extra lines of code that show that a programmer took the time to think about what they were doing than a "clever trick" that is hard to reason about and which is probably hiding some subtle bug.

mtoddh13y ago

If you understood pointers well enough, you would see Linus's version as more readable.

I agree - and as someone who makes his living as a C programmer, I'm sure someone could easily come up with a terse piece of Ruby code that I would view as too clever, and therefore confusing, but which the vast majority of programmers on here would look at and think, "a beginner may be confused by this, but a competent Ruby programmer would not be."

I know some of this comes down to stylistic preferences, and likely some have had bad experiences maintaining code that went overboard with this sort of thing. But I also think that if these snippets of code were being discussed in a forum with a bias towards low-level development, rather than HN, which seems biased towards web development, you'd be seeing some very different responses.

1 more reply

muyuu13y ago

Both are relatively trivial. Trading an if for an indirection is not necessarily "better", it's an option.

As I said before, both ways are textbook basic singly linked list deletions. Linus' version does the dummy pointer implicitly. Generally doing things like that requires an extra moment to realise what's going on.

The smartass idea that "anything that is understandable is equally understandable" simply falls flat in the face of reality.

To claim one piece of code to be better than another you need a more scientific argument than "avoiding an if" if you add something else (an indirection) instead.

corresation13y ago

Just to step back for a moment, it's interesting that /include/linux/list.h in the Linux kernel is a bog-standard linked list implementation. There is nothing particularly clever about it.

Speaking of the word clever, I think some are clutching that a little too proudly. In this context it doesn't mean "more skilled" or "more efficient" or "technically better", it simply means doing something in a particular way because you think it is unique. There is little to no argument to be made in favour of the approach in the linked blog entry, but there are many potential risks with it, not least of which is that it adds one more potential misunderstanding by skilled practitioners. While I love pointers to bits, and love unmanaged, native code, there is a good reason why most environments now trying to reduce their usage, even with the bestest of the bestest developers manning the keyboards.

1 more reply

huhtenberg13y ago· 5 in thread

The double-pointer trick makes for a more concise code, but it doesn't always lead to the code that is faster.

That said, it's actually an excellent C interview question. Combined with "write a binary tree iterator" it reliably measures one's exposure to thoughtful C programming.

mtoddh13y ago

That said, it's actually an excellent C interview question

This was actually one of the very first questions we used to ask during interviews at Tenable. I'm honestly kind of amazed it's now seen as some sort of tricky technique. I would think (or hope) most C programmers would see it as basic competence with pointers.

npsimons13y ago

I would think (or hope) most C programmers would see it as basic competence with pointers.

Which is exactly what Torvald's was saying. All the comments here going "that's unreadable" or "it's too clever" scare me. And I don't even consider myself some C coding rockstar (took me a bit to sort out my own mental picture of pointers).

huhtenberg13y ago

"Hope" is more like it. I went to one interview where they asked to traverse a list, and since it was for a senior C position I gave them the * * version. There was this pause and then their lead dev said that this cannot possibly work. It was really awkward.

1 more reply

lucian190013y ago

It likely is faster because it avoids more branching altogether, thus branch mis-predictions.

ww52013y ago

But memory access is more expensive. Double de-referencing a pointer to pointer might perform worse.

1 more reply

kabdib13y ago· 4 in thread

The "two star" solution looks like it generates more memory traffic in the case of many deletions, and may run afoul of load-hit-store performance problems on some platforms, since the write to '*curr' in the remove case may still be in flight on its subsequent read and cause a pipeline stall.

Then again, the 'rm' predicate may dominate, or removes may be infrequent, and then who cares? Or your CPU may not care about LHS stalls (by virtue of being very whizzy, or just so terrible that the optimization doesn't matter).

Which is to say, measure and be prepared to be flexible rather than statically clever.

demallien13y ago

Yup, potential performance issues were the first things that came to my mind on seeing that code too. Plus, it's just silly to say that someone that just wrote a linked list doesn't understand pointers - pretty much by definition they do understand pointers, otherwise their linked list isn't going to work.

My personal rule of thumb concerning pointers is that two-stars should never be seen except in the case where you want a function to be able to modify a pointer that was passed in as a parameter. That's it. If I get handled a module that has two-stars in it under any other circumstance, I refactor it. I do this because:

a) lots of people find it difficult to thing about pointers to pointers (although I've never understood why empirically I note that this is the case).

b) Having to dereference two pointers to get to the information that you want is not just slow because of extra work needing to be done, but also because you've just increased the need for doing a page change in memory, which on some architectures can be surprisingly expensive.

to3m13y ago

Where does the double dereference come from in the second case? curr->next and (* curr) both perform a dereference. (If anything, (* curr) will be at least as efficient, if not more, because there's guaranteed to be no offset.)

(Does anybody know how you type an asterisk?)

4 more replies

stephencanon13y ago

> The "two star" solution looks like it generates more memory traffic in the case of many deletions.

It certainly shouldn't. There's no difference at an architectural level between keeping track of prev and keeping track of &prev->next (as should be obvious, because they differ by a constant offset): advancing in the list is just a pointer update via a load either way, and removing a link is just a store.

Since they do the same thing, the opportunity for an LHS stall may be avoided in either approach with a little care.

kabdib13y ago

Before posting the above, I compiled both pieces of code and looked at the assembly output.

Load-hit-stores are one of those pernicious things that bite you a couple times, then you start noticing them. They are indeed "architectural" to a number of processors common in video game consoles, where cycle stealers get a lot of sunlight.

2 more replies

loup-vaillant13y ago· 4 in thread

Something 's not right. Still too complex. Ah. Lists are recursive. Their treatment should be recursive as well. Let's see…

  void remove_if(node ** head, remove_fn rm)
  {
      node * entry = *head;
      if (rm(entry))
      {
          *head = entry->next;
          free(entry);
      }
      else
          head = &entry->next;
      remove_if(head, rm);
   }

Ahh, much better: Now we can refactor:

  void remove_head(node ** head)
  {
      node * entry = *head;
      *head = entry->next;
      free(entry);
  }

  void remove_if(node ** head, remove_fn rm)
  {
      if (rm(*head))
          remove_head(head)
      else
          head = &entry->next;
      remove_if(head, rm);
   }

I'm sure GCC will be able to optimize away the tail call.

Edit: oops, I forgot to test for empty lists, which was implicit in the for loop I set out to remove…

JoachimSchipper13y ago

Tail calls are considered very bad style, for good reason. gcc can indeed optimize them away, at a sufficiently high level of optimization; but it's not very nice to write code that, e.g., runs out of stack space only when compiled for debugging.

AngryParsley13y ago

"...very bad style"? That's quite an overstatement. You can enable individual optimizations with gcc. Just add `-foptimize-sibling-calls` to the debug CFLAGS and you're set. Tada! Straightforward code and debug-ability.

1 more reply

betterunix13y ago

On the other hand, for a linked list, recursion follows immediately from the definition of the list, while iteration does not. I would rather have my debugger allocate a large stack or compile the code so that TCO was enabled even when I was debugging than have to deal with implementations that do not clearly follow from a high-level design.

scscsc13y ago

Lists are recursive, right. Better than double pointer indirection is to have the function return the new head. Note that the following code is not functional, but returning the new head makes it cleaner.

    node *remove_if(node *head, remove_fn rm)
    {
        if (rm(head)) {
            return remove_if(head->next);
        }
        head->next = remove_if(head->next);
        return head;
     }

A good use of this technique (returning pointers to the resulting node) is the following code which very cleanly implements AVL trees in C (from http://infoarena.ro/multe-smenuri-de-programare-in-cc-si-nu-... -- scroll to the end (sorry, main article in Romanian)):

    #define max(a, b) ((a) > (b) ? (a) : (b))
    #define geth(n) (n->h = 1 + max(n->l->h, n->r->h))
     
    struct node
    {
          int key, h;
          struct node *l, *r;
    } *R, *NIL;
    typedef struct node node;
     
    void init(void)
    {
          R = NIL = (node *) malloc(sizeof(node));
          NIL->key = NIL->h = 0,
                  NIL->l = NIL->r = NULL;
    }
     
    node* rotleft(node *n)
    {
          node *t = n->l;
     
          n->l = t->r, t->r = n,
                  geth(n), geth(t);
          return t;
    }
     
    node* rotright(node *n)
    {
          node *t = n->r;
      
          n->r = t->l, t->l = n,
                  geth(n), geth(t);
          return t;
    }
     
    node* balance(node *n)
    {
          geth(n);
          if (n->l->h > n->r->h + 1)
          {
                  if (n->l->r->h > n->l->l->h)
                          n->l = rotright(n->l);
                  n = rotleft(n);
          }
          else
                  if (n->r->h > n->l->h + 1)
                  {
                          if (n->r->l->h > n->r->r->h)
                                  n->r = rotleft(n->r);
                          n = rotright(n);
                  }
          return n;
    }
     
    node* insert(node *n, int key)
    {
          if (n == NIL)
          {
                  n = (node *) malloc(sizeof(node));
                  n->key = key, n->h = 1, n->l = n->r = NIL;
                  return n;
          }
          if (key < n->key)
                  n->l = insert(n->l, key);
          else
                  n->r = insert(n->r, key);
          return balance(n);
    }
     
    node* erase(node *n, int key)
    {
          node *t;
          if (n == NIL) return n;
          if (n->key == key)
          {
                  if (n->l == NIL || n->r == NIL)
                  {
                          t = n->l == NIL ? n->r : n->l;
                          free(n); return t;
                  }
                  else
                  {
                          for (t = n->r; t->l != NIL; t = t->l);
                          n->key = t->key,
                                  n->r = erase(n->r, t->key);
                          return balance(n);
                  }
          }
          if (key < n->key)
                  n->l = erase(n->l, key);
          else
                  n->r = erase(n->r, key);
          return balance(n);
    }
     
    int search(node *n, int key)
    {
          if (n == NIL) return 0;
          if (n->key == key) return 1;
          if (key < n->key)
                  return search(n->l, key);
          else
                  return search(n->r, key);
    }

stiff13y ago· 3 in thread

With one of my friends at the university we used to compete for a shorter/nicer implementation of each data structure in the algorithms class we were taking, and after using pointers-to-pointers in some previous cases (linked lists, binary search trees etc.) and each time ending up with an implementation I really liked, I tried to implement AVL trees[1] the same way, for example an AVL rotation would be done by a call like this:

  avl_tree_rotate(avl, &t->left->right->left, &t->left->right, &t->left, t->left->right);

Where avl_tree_rotate looked like this (avl_tree_update just recomputes the height after left or right branch height changes):

  static void avl_tree_rotate(avl avl, V **from, V **to, V **parent, V *child) {
    if(*from) (*from)->parent = *to   ? (*to)->parent   : NULL;
    if(*to)   (*to)->parent   = *from ? (*from)->parent : NULL;
    avl_tree_swap(avl, from, to);
    child->parent = (*parent)->parent;
    if(!((*parent)->parent))
      avl->root = child;
    else if((*parent)->parent->left == *parent)
      (*parent)->parent->left = child;
    else if((*parent)->parent->right == *parent)
      (*parent)->parent->right = child;
    (*parent)->parent = child;
    *from = *parent;
    avl_tree_update(avl, *parent);
    avl_tree_update(avl, child);
  }

It was one of nicer programming workouts I ever got (the whole class was a great experience and I think every professional programmer should go through implementing all the basic data structures and algorithms in plain C one time), but I never managed to get deletion right, despite spending quite some hours on it, it just went beyond my head. Not that it matters for linked lists, and in this case I made it deliberately harder on myself, but I think the moral of the story is that when you are too clever with implementation details, you might run out of your cleverness when starting to deal with the actual problem domain.

[1] http://en.wikipedia.org/wiki/AVL_tree

ZoFreX13y ago

> I think the moral of the story is that when you are too clever with implementation details, you might run out of your cleverness when starting to deal with the actual problem domain.

That reminded me of this quote: "Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?"

http://en.wikiquote.org/wiki/Brian_Kernighan

koenigdavidmj13y ago

The corollary: code stoned, debug caffeinated, and document drunk.

1 more reply

njs1234513y ago

A fibonacci heap (in almost any language) is also one I would recommend. Definitely the hardest single thing I remember doing in college.

chrisbennet13y ago· 3 in thread

Unless I really need the performance, I avoid being "clever" when coding. The simple non-2 star example is going to be a lot faster for someone to understand, maintain and debug in the future.

"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." --Brian Kernighan

nicholassmith13y ago

Being clever when coding is one of the biggest headaches for long term maintenance unless you really really need that extra performance. I've had to do a bug hunt on a piece of code recently that was very, very clever but used all sorts language features that are marked 'use with caution' when a longer function with more simplicity would have worked as well.

Readability and maintainability trump developers showing how clever they are by far.

1 more reply

capisce13y ago

Whether it's faster or slower to understand depends on how well the reader is able to grasp abstraction and pointers. For some the latter code might be more readable since it has less special cases.

damian200013y ago

Came here to say the same thing. It would also result in more work for the original programmer, since they would be asked questions about how it works from other (non-2 star) programmers.

andrewcooke13y ago· 3 in thread

i had to write very similar code (deleting from a linked list) a couple of days ago. i wasn't the easiest code to write, and i needed several iterations before my tests were ok, but i remember feeling happy that the final result was clear and correct.

i just checked it and, yes, i used "two stars".

now i'm reading all the comments here, which are predominantly critical, calling this "clever" code. and it struck me just how lucky i am to work with decent programmers. where this kind of thing is considered quality work, not "showing off".

what i don't understand is why the other commenters here use pointers at all. couldn't you rewrite this to use arrays with integer indices? then you wouldn't need any stars (or very few), and it would be even "easier".

well, that's not true. i do understand. everyone thinks their own level is the correct one.

but still, i have sometimes been lucky enough to work with people who knew a lot more than i did. when i was in that position i didn't think they were "showing off". i thought they were awesome and tried to learn from them.

danabramov13y ago

>it struck me just how lucky i am to work with decent programmers. where this kind of thing is considered quality work, not "showing off".

My rule of thumb is I can write “clever” code if I make sure it would be easy to debug for me or anyone else. This usually includes indented logging for anything complicated, good choice of variable names and plenty of debug assertions that would fail with a meaningful message if something went wrong.

When the deadline comes and you're dealing with an obscure bug in a clever code that looks right but is difficult to reason about, you'll be damning it.

1 more reply

betterunix13y ago

"i had to write very similar code (deleting from a linked list) a couple of days ago. i wasn't the easiest code to write, and i needed several iterations before my tests were ok, but i remember feeling happy that the final result was clear and correct."

How was this not easy code to write? I imagine it would have looked like this, but in your language of choice:

  (defun remove-if (lst pred)
     (if (null lst)
         nil
         (if (funcall pred (first lst))
             (remove-if (rest lst) pred)
             (cons (first lst) (remove-if (rest lst) pred))
             )
         )
      )

This is straightforward and basically follows from the definition of a list: either null (empty) or a pair whose second element is a list. The C version is uglier because of the need to deallocate removed nodes, track pointers, etc., but it still follows from the definition of a list.

andrewcooke13y ago

eh, what do you want me to say? everyone else here is calling me a smartass and you're telling me i'm an idiot.

it was c. i made some mistakes. i fixed them.

2 more replies

antirez13y ago· 1 in thread

The idea that you can classify a programmer's understanding of pointers based on doing linked list tricks is absurd. Usually Linus says things that make sense, but not this time IMHO.

Also the code proposed may work for linked lists that is a very well understood topic, but code like this in general is hard to understand and debug, without actually providing some serious advantage.

I remember some code on the Linux kernel implementing the inode caching in the virtual file system layer that was a mess like that, but multiplied by 1000. Very "Smart" indeed, but for the sake of what?

corresation13y ago

I am not even sure that this is what Linus was trying to say: I extracted a very different intention from his comment than the author of this post.

twp13y ago· 1 in thread

A nice example of the old adage that "most problems in computer science can be solved by adding an extra layer of indirection".

raldi13y ago

"...except too-many-layers-of-indirection."

ruv13y ago· 1 in thread

Honestly as somebody who reads/ writes C every couple of months, I don't see what's so clever about it. This level of understanding for a C programmer should be the bare minimum, I think...

qu4z-213y ago

Which is very much what Linus was saying (and I agree).

lsiebert13y ago· 1 in thread

You can just use a dummy node and track the previous entry. Then you don't have to ever pass back the front of the list, or treat the front of the list as special at all.

It's more elegant then the double level pointer, imho.

huhtenberg13y ago

How is it more elegant if it effectively doubles the number of pointer dereferences?

jules13y ago

Is this really so amazing? I "invented" this long ago when I first implemented a binary tree. First you code the ugly version, and then in your mind or on paper you execute it and you see that there's nothing special about the root node, yet it is handled by a special case. Then you remove the special case, that's all. So keep in mind that contrary to what several people here are saying, it does not cause extra pointer dereferences.

ambrop713y ago

Just for fun I made a benchmark for "two-star" and "one-star" versions of remove_if() and build() (to link an array of nodes). Code: http://ideone.com/D7bzmo

Results with {gcc-4.7.2,clang-3.2} -march=corei7 -O3

  build_onestar+remove_if_onestar: 0m0.309s 0m0.357s
  build_onestar+remove_if_twostar: 0m0.302s 0m0.324s
  build_twostar+remove_if_onestar: 0m0.310s 0m0.361s
  build_twostar+remove_if_twostar: 0m0.302s 0m0.320s

The two-star version of remove_if() is a bit faster here, but both build() versions are the same, across both compilers.

jiggy201113y ago

Not a C programmer , but if I was asked to implement this I would be more comfortable doing something like the first. The second took me a few moments to figure out.

Perhaps this is a result of many people learning in languages like Java where the idea of double indirection isn't quite so explicit so this style feels somewhat unnatural?

I could easily imagine introducing very weird bugs by mixing up my pointers and pointers-to-pointers. To mentally visualise a linked list I find it easier to just imagine a stack of boxes attached by string and work on the basis of cutting and reattaching the string to different points, this is more difficult under the 2nd approach.

csl13y ago

Reminds me of the very funny description of a Three Star Programmer over at http://c2.com/cgi/wiki?ThreeStarProgrammer

EDIT: I see it's also linked in the article.

bjoernbu13y ago

Why is this supposed to be such a big deal? The second example uses the language's features to be able to use shorter code.

Usually there are countless possibilities for the exact same thing in every program. Personally, I would never say those who produce code with unused possibilities do "not understand" the corresponding language features.

Especially when writing example code or working with less experienced programmers, I even think that using less language features for the sake of making your algorithmic idea as obvious as possible is better practice.

For exmapel, I find it far more important that one decides wisely when to use a linked list at all or, if there are multiple fuctions for removal, chooses the right one for a particular use-case. Therefore everyone has to understand whats happening.

rjknight13y ago

This makes me miss programming in C.

j / k navigate · click thread line to collapse

63 comments

48 comments · 17 top-level

muyuu13y ago· 5 in thread

I don't see how Linus' version is better or more readable. By adding an indirection you have the equivalent of a dummy node, both ways are textbook examples of singly linked list deletion.

adamtj13y ago

This is exactly what Linus was complaining about.

Your argument is like trying to claim that twenty lines of string munging is more readable than a straightforward regex. That's only true if you don't really get regular expressions.

Readability is in the eye of the beholder.

betterunix13y ago

"If you understood pointers well enough, you would see Linus's version as more readable"

mtoddh13y ago

If you understood pointers well enough, you would see Linus's version as more readable.

1 more reply

muyuu13y ago

Both are relatively trivial. Trading an if for an indirection is not necessarily "better", it's an option.

The smartass idea that "anything that is understandable is equally understandable" simply falls flat in the face of reality.

To claim one piece of code to be better than another you need a more scientific argument than "avoiding an if" if you add something else (an indirection) instead.

corresation13y ago

Just to step back for a moment, it's interesting that /include/linux/list.h in the Linux kernel is a bog-standard linked list implementation. There is nothing particularly clever about it.

1 more reply

huhtenberg13y ago· 5 in thread

The double-pointer trick makes for a more concise code, but it doesn't always lead to the code that is faster.

That said, it's actually an excellent C interview question. Combined with "write a binary tree iterator" it reliably measures one's exposure to thoughtful C programming.

mtoddh13y ago

That said, it's actually an excellent C interview question

npsimons13y ago

I would think (or hope) most C programmers would see it as basic competence with pointers.

huhtenberg13y ago

1 more reply

lucian190013y ago

It likely is faster because it avoids more branching altogether, thus branch mis-predictions.

ww52013y ago

But memory access is more expensive. Double de-referencing a pointer to pointer might perform worse.

1 more reply

kabdib13y ago· 4 in thread

Which is to say, measure and be prepared to be flexible rather than statically clever.

demallien13y ago

a) lots of people find it difficult to thing about pointers to pointers (although I've never understood why empirically I note that this is the case).

to3m13y ago

(Does anybody know how you type an asterisk?)

4 more replies

stephencanon13y ago

> The "two star" solution looks like it generates more memory traffic in the case of many deletions.

Since they do the same thing, the opportunity for an LHS stall may be avoided in either approach with a little care.

kabdib13y ago

Before posting the above, I compiled both pieces of code and looked at the assembly output.

2 more replies

loup-vaillant13y ago· 4 in thread

Something 's not right. Still too complex. Ah. Lists are recursive. Their treatment should be recursive as well. Let's see…

  void remove_if(node ** head, remove_fn rm)
  {
      node * entry = *head;
      if (rm(entry))
      {
          *head = entry->next;
          free(entry);
      }
      else
          head = &entry->next;
      remove_if(head, rm);
   }

Ahh, much better: Now we can refactor:

  void remove_head(node ** head)
  {
      node * entry = *head;
      *head = entry->next;
      free(entry);
  }

  void remove_if(node ** head, remove_fn rm)
  {
      if (rm(*head))
          remove_head(head)
      else
          head = &entry->next;
      remove_if(head, rm);
   }

I'm sure GCC will be able to optimize away the tail call.

Edit: oops, I forgot to test for empty lists, which was implicit in the for loop I set out to remove…

JoachimSchipper13y ago

AngryParsley13y ago

1 more reply

betterunix13y ago

scscsc13y ago

    node *remove_if(node *head, remove_fn rm)
    {
        if (rm(head)) {
            return remove_if(head->next);
        }
        head->next = remove_if(head->next);
        return head;
     }

    #define max(a, b) ((a) > (b) ? (a) : (b))
    #define geth(n) (n->h = 1 + max(n->l->h, n->r->h))
     
    struct node
    {
          int key, h;
          struct node *l, *r;
    } *R, *NIL;
    typedef struct node node;
     
    void init(void)
    {
          R = NIL = (node *) malloc(sizeof(node));
          NIL->key = NIL->h = 0,
                  NIL->l = NIL->r = NULL;
    }
     
    node* rotleft(node *n)
    {
          node *t = n->l;
     
          n->l = t->r, t->r = n,
                  geth(n), geth(t);
          return t;
    }
     
    node* rotright(node *n)
    {
          node *t = n->r;
      
          n->r = t->l, t->l = n,
                  geth(n), geth(t);
          return t;
    }
     
    node* balance(node *n)
    {
          geth(n);
          if (n->l->h > n->r->h + 1)
          {
                  if (n->l->r->h > n->l->l->h)
                          n->l = rotright(n->l);
                  n = rotleft(n);
          }
          else
                  if (n->r->h > n->l->h + 1)
                  {
                          if (n->r->l->h > n->r->r->h)
                                  n->r = rotleft(n->r);
                          n = rotright(n);
                  }
          return n;
    }
     
    node* insert(node *n, int key)
    {
          if (n == NIL)
          {
                  n = (node *) malloc(sizeof(node));
                  n->key = key, n->h = 1, n->l = n->r = NIL;
                  return n;
          }
          if (key < n->key)
                  n->l = insert(n->l, key);
          else
                  n->r = insert(n->r, key);
          return balance(n);
    }
     
    node* erase(node *n, int key)
    {
          node *t;
          if (n == NIL) return n;
          if (n->key == key)
          {
                  if (n->l == NIL || n->r == NIL)
                  {
                          t = n->l == NIL ? n->r : n->l;
                          free(n); return t;
                  }
                  else
                  {
                          for (t = n->r; t->l != NIL; t = t->l);
                          n->key = t->key,
                                  n->r = erase(n->r, t->key);
                          return balance(n);
                  }
          }
          if (key < n->key)
                  n->l = erase(n->l, key);
          else
                  n->r = erase(n->r, key);
          return balance(n);
    }
     
    int search(node *n, int key)
    {
          if (n == NIL) return 0;
          if (n->key == key) return 1;
          if (key < n->key)
                  return search(n->l, key);
          else
                  return search(n->r, key);
    }

stiff13y ago· 3 in thread

  avl_tree_rotate(avl, &t->left->right->left, &t->left->right, &t->left, t->left->right);

Where avl_tree_rotate looked like this (avl_tree_update just recomputes the height after left or right branch height changes):

  static void avl_tree_rotate(avl avl, V **from, V **to, V **parent, V *child) {
    if(*from) (*from)->parent = *to   ? (*to)->parent   : NULL;
    if(*to)   (*to)->parent   = *from ? (*from)->parent : NULL;
    avl_tree_swap(avl, from, to);
    child->parent = (*parent)->parent;
    if(!((*parent)->parent))
      avl->root = child;
    else if((*parent)->parent->left == *parent)
      (*parent)->parent->left = child;
    else if((*parent)->parent->right == *parent)
      (*parent)->parent->right = child;
    (*parent)->parent = child;
    *from = *parent;
    avl_tree_update(avl, *parent);
    avl_tree_update(avl, child);
  }

[1] http://en.wikipedia.org/wiki/AVL_tree

ZoFreX13y ago

> I think the moral of the story is that when you are too clever with implementation details, you might run out of your cleverness when starting to deal with the actual problem domain.

http://en.wikiquote.org/wiki/Brian_Kernighan

koenigdavidmj13y ago

The corollary: code stoned, debug caffeinated, and document drunk.

1 more reply

njs1234513y ago

A fibonacci heap (in almost any language) is also one I would recommend. Definitely the hardest single thing I remember doing in college.

chrisbennet13y ago· 3 in thread

Unless I really need the performance, I avoid being "clever" when coding. The simple non-2 star example is going to be a lot faster for someone to understand, maintain and debug in the future.

"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." --Brian Kernighan

nicholassmith13y ago

Readability and maintainability trump developers showing how clever they are by far.

1 more reply

capisce13y ago

Whether it's faster or slower to understand depends on how well the reader is able to grasp abstraction and pointers. For some the latter code might be more readable since it has less special cases.

damian200013y ago

Came here to say the same thing. It would also result in more work for the original programmer, since they would be asked questions about how it works from other (non-2 star) programmers.

andrewcooke13y ago· 3 in thread

i just checked it and, yes, i used "two stars".

well, that's not true. i do understand. everyone thinks their own level is the correct one.

danabramov13y ago

>it struck me just how lucky i am to work with decent programmers. where this kind of thing is considered quality work, not "showing off".

When the deadline comes and you're dealing with an obscure bug in a clever code that looks right but is difficult to reason about, you'll be damning it.

1 more reply

betterunix13y ago

How was this not easy code to write? I imagine it would have looked like this, but in your language of choice:

  (defun remove-if (lst pred)
     (if (null lst)
         nil
         (if (funcall pred (first lst))
             (remove-if (rest lst) pred)
             (cons (first lst) (remove-if (rest lst) pred))
             )
         )
      )

andrewcooke13y ago

eh, what do you want me to say? everyone else here is calling me a smartass and you're telling me i'm an idiot.

it was c. i made some mistakes. i fixed them.

2 more replies

antirez13y ago· 1 in thread

The idea that you can classify a programmer's understanding of pointers based on doing linked list tricks is absurd. Usually Linus says things that make sense, but not this time IMHO.

corresation13y ago

I am not even sure that this is what Linus was trying to say: I extracted a very different intention from his comment than the author of this post.

twp13y ago· 1 in thread

A nice example of the old adage that "most problems in computer science can be solved by adding an extra layer of indirection".

raldi13y ago

"...except too-many-layers-of-indirection."

ruv13y ago· 1 in thread

Honestly as somebody who reads/ writes C every couple of months, I don't see what's so clever about it. This level of understanding for a C programmer should be the bare minimum, I think...

qu4z-213y ago

Which is very much what Linus was saying (and I agree).

lsiebert13y ago· 1 in thread

You can just use a dummy node and track the previous entry. Then you don't have to ever pass back the front of the list, or treat the front of the list as special at all.

It's more elegant then the double level pointer, imho.

huhtenberg13y ago

How is it more elegant if it effectively doubles the number of pointer dereferences?

jules13y ago

ambrop713y ago

Just for fun I made a benchmark for "two-star" and "one-star" versions of remove_if() and build() (to link an array of nodes). Code: http://ideone.com/D7bzmo

Results with {gcc-4.7.2,clang-3.2} -march=corei7 -O3

  build_onestar+remove_if_onestar: 0m0.309s 0m0.357s
  build_onestar+remove_if_twostar: 0m0.302s 0m0.324s
  build_twostar+remove_if_onestar: 0m0.310s 0m0.361s
  build_twostar+remove_if_twostar: 0m0.302s 0m0.320s

The two-star version of remove_if() is a bit faster here, but both build() versions are the same, across both compilers.

jiggy201113y ago

Not a C programmer , but if I was asked to implement this I would be more comfortable doing something like the first. The second took me a few moments to figure out.

Perhaps this is a result of many people learning in languages like Java where the idea of double indirection isn't quite so explicit so this style feels somewhat unnatural?

csl13y ago

Reminds me of the very funny description of a Three Star Programmer over at http://c2.com/cgi/wiki?ThreeStarProgrammer

EDIT: I see it's also linked in the article.

bjoernbu13y ago

Why is this supposed to be such a big deal? The second example uses the language's features to be able to use shorter code.

rjknight13y ago

This makes me miss programming in C.

j / k navigate · click thread line to collapse