People grossly overestimate the cost of vector relocation while underestimating the benefits of cache locality. Vector relocation is so cheap because all the bits are in the cache. You can relocate all the elements in a reasonably sized vector faster than you can dereference a pointer to somewhere in memory that isn't in cache.
If you want log(n) lookup like a set/map you can keep the vector sorted. If you want fast erase (and don't care about order) you can overwrite the erased element with the last element and resize down by one (no memory allocation).
The <algorithm> header makes these operations trivial.
C++11 move semantics mean that there are a lot of types you can keep by value in a vector that you wouldn't have before and they stay fast.
The purpose of intrusive linked lists is to provide constant time, allocation-free insert/remove from multiple sequences that all refer to the same object (identity, not value), given only a pointer to the object.
Multiple vectors obviously don't provide this, since vectors contain objects by value. Multiple vectors of pointers can work, but then removal requires either an iterator or a linear scan for each vector. If your object is in many lists at once, that gets pretty messy.
In particular the way that game engines use this, with objects packed in preallocated arrays, gives constant time insertion and deletion from multiple sequences with absolutely zero allocation or copying at any time. Vectors simply do not provide this functionality.
The way I've typically done this is by having one global "all entities" std::vector and then lists, maps or whatever for specific subsets.
Usually, my "entity" object is little more than a container for behaviours, so in reality it's a little more complicated than that...
If you can own the objects in one by-value vector, that's even better: traversing over them then involves only address arithmetic on objects that are probably already in the cache.
Edit: Sorry, I read your comment incorrectly. Instead of std::set I would use a sorted vector without a second thought for anything up to a thousand elements or so. After that it's probably worth profiling; a vector is probably still good well beyond that, depending on your use case.
1. He's not comparing apples-to-apples between `std::list` and his intrusive linked list; the proper analogue would be to unlink a node from its iterator, for which the running time is still O(1).
2. The main (only?) reason to use intrusive lists is for the single indirection (which helps both memory and speed). In his example for how using std::list would crash his server because of the O(N) node removal, he's just not storing the right data for each connection (again, use an iterator).
3. He looked at boost's intrusive list, but I'm guessing he didn't actually try it out. The examples are pretty good, and it's much easier than it "looks". (That is, boost libraries can look intimidating when you first look at them because they're so template-heavy.)
4. It may even be that a vector, plus an auxiliary data structure for lookup, may be faster.
2. Also the safety. You can assert that the object is not already a member of a linked list when you add it to one.
3. A general rule of thumb when using C++ is that you shouldn't use Boost. Boost is a source of major headaches, even in innocuous-looking parts like smart pointers. There are many reasons to avoid it in general, the most important being compile times: Boost developers are terrible at keeping compile times down, and you can reimplement a Boost header (for example, for uuid, shared_ptr, intrusive_ptr, and so on) and get a significant compile-time speed-up.

The other problem with Boost libraries is that they lack assertions appropriate for software development. For example, with a scoped_ptr it is better to have an init(T *) method that asserts the object has not been initialized, rather than just a general reset method, because with most uses of scoped_ptr<T>::reset() (in code that I've worked with) you're expecting it to currently be empty.
(Edit: Also, boost libraries are intimidating because the documentation is bad.)
4. Maybe.
That's not a rule of thumb. It may be something you do, and a reason I'd never work with you. Boost is a wonderful library that IME solves far more problems than it creates. Compile times aren't a problem with modern compilers, and your scoped_ptr example is nonsense.
See also <sys/queue.h>.
His example code is very dubious as it looks like C-with-classes rather than C++, mostly due to the lack of RAII.
Intrusive lists are still worth knowing and using, it's just that the author's reasoning was terrible. I found the Boost.Intrusive page to be much more knowledgeable: http://www.boost.org/doc/libs/1_35_0/doc/html/intrusive/intr...
He's a game programmer. He's keeping pointers to instances because these entities are part of an inheritance hierarchy[0]. That level of indirection is essential to his problem and completely unavoidable. On the other hand, std::list<std::unique_ptr<foo>> wouldn't change anything at all about the particularly inefficient memory arrangement of std::list<foo*>; std::unique_ptr is, after all, just a class with one pointer member.
When smart people write things you think are obviously dumb, you should invest some additional time trying to understand the context of their statements before writing them off. You'll learn more, and you won't come off as a flippant junior developer.
[0] http://www.codeofhonor.com/blog/tough-times-on-the-road-to-s...
The reasoning and requirements for him using intrusive lists was quite clear to me. The remaining article that starts at "Using intrusive lists" I found quite interesting and insightful.
My entire argument is against his evidence backing up statements like, "If you’re a programmer who uses the C++ STL std::list, you’re doing it wrong." It's unfair to improperly use std::list, compare it to a better solution, and then make a sweeping generalization. That's just outrageous.
The main things to take away from the blogpost were, I think, the concept of intrusive lists, to think about what objects look like in memory, and how you can construct code so that an object can be part of multiple containers and can remove itself from all of them when it's destroyed. Those are certainly concepts worth talking about, and so I think the blogpost was pretty great.
As for std::list... it's useless.
- "this guy" is actually a highly experienced and competent programmer.
- He programs games, with often very specific domain requirements and challenges. Try to understand context before making sweeping generalizations.
I understand game programming is a very different enterprise with very different tradeoffs and motivations, but I'm not sold on this for "pedestrian" application development.
In particular, this isn't something "wrong" with std::list. He has a situation where he wants an object to manage its own membership in a mutable container. He says you can't do this efficiently with a simple non-intrusive linked list. He is right. You also can't do it efficiently with an array (std::vector, in this context).
You can do it efficiently with an intrusive linked list, as he points out. You can also use a non-intrusive linked list in which each object holds an iterator to itself. Or you can use an associative structure (std::map, std::unordered_map), in which each object holds its own key.
The intrusive linked list solution is going to have the fastest container insert & delete operations of all of these. But that doesn't mean it is the best solution for every circumstance.
Another point to be made, which he kinda-sorta gets at, is that it is a good idea to know how to code a linked list. The bulk of data structure decisions are just figuring out what already written package to use. But there is definitely still a place for a custom, application-specific linked list, and these are not difficult to write.
Also, it doesn't appear the provided code gracefully handles removing an item from a list twice.
I feel like this kind of problem is actually a specific instance of a much larger problem - that being explicit about these things leads to more maintainable code, at the expense of some 'leet things' that can cause troubles if not implemented perfectly.
m_prevLink->m_nextNode = m_nextNode;
I don't see where m_prevLink is changed. If the previous link has gone away and you call RemoveFromList on this node a second time, it's going to chase that pointer.
The first two examples don't have this problem though.
Linked lists are seldom the right answer; contiguous blocks of memory are cache-friendly, and therefore algorithm-friendly.
STL performance is usually quite easy to beat in special cases as well (e.g. most game code). Some STL implementations are terrible, too: the MS one is riddled with locks and all sorts of sledgehammer thread-safety measures, which you just don't need if you know which bits of code are threaded and which aren't.
A vector of pointers gives no contiguity advantage over an intrusive list during traversal (or any other operation).
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n227...
And a github repository that is still maintained:
I think the reason for dropping centuries from dates was not to gain speed, but to save two bytes. In the 1970s, two bytes of storage cost a lot more than they do today, and space on punch cards was limited.