People grossly overestimate the cost of vector relocation while underestimating the benefits of cache locality. Vector relocation is so cheap because all the bits are in the cache. You can relocate all the elements in a reasonably sized vector faster than you can dereference a pointer to somewhere in memory that isn't in cache.
If you want log(n) lookup like a set/map you can keep the vector sorted. If you want fast erase (and don't care about order) you can overwrite the erased element with the last element and resize down by one (no memory allocation).
The <algorithm> header makes these operations trivial.
C++11 move semantics mean that there are a lot of types you can keep by value in a vector that you wouldn't have before and they stay fast.
The purpose of intrusive linked lists is to provide constant time, allocation-free insert/remove from multiple sequences that all refer to the same object (identity, not value), given only a pointer to the object.
Multiple vectors obviously don't provide this, since vectors contain objects by value. Multiple vectors of pointers can work, but then removal requires either an iterator or a linear scan for each vector. If your object is in many lists at once, that gets pretty messy.
In particular the way that game engines use this, with objects packed in preallocated arrays, gives constant time insertion and deletion from multiple sequences with absolutely zero allocation or copying at any time. Vectors simply do not provide this functionality.
The way I've typically done this is by having one global "all entities" std::vector and then lists, maps or whatever for specific subsets.
Usually, my "entity" object is little more than a container for behaviours, so in reality it's a little more complicated than that...
If you can own the objects in one by-value vector, that's even better: traversing over them then involves only address arithmetic on objects that are probably already in the cache.
Edit: Sorry, I read your comment incorrectly. Instead of std::set I would use a sorted vector without a second thought for anything up to a thousand elements or so. After that it's probably worth profiling; a vector is probably still good well beyond that, depending on your use case.
1. He's not comparing apples-to-apples between `std::list` and his intrusive linked list; the proper analogue would be to unlink a node from its iterator, for which the running time is still O(1).
2. The main (only?) reason to use intrusive lists is for the single indirection (which helps both memory and speed). In his example for how using std::list would crash his server because of the O(N) node removal, he's just not storing the right data for each connection (again, use an iterator).
3. He looked at boost's intrusive list, but I'm guessing he didn't actually try it out. The examples are pretty good, and it's much easier than it "looks". (That is, boost libraries can look intimidating when you first look at them because they're so template-heavy.)
4. It may even be that a vector, plus an auxiliary data structure for lookup, may be faster.
2. Also the safety. You can assert that the object is not already a member of a linked list when you add it to one.
3. A general rule of thumb when using C++ is that you shouldn't use Boost. Boost is a source of major headaches, even in innocuous-looking parts like smart pointers. There are many reasons to avoid it in general, the most important being compile times: Boost developers are terrible at keeping compile times down, and you can reimplement a Boost header (for example, for uuid, shared_ptr, intrusive_ptr, and so on) and get a significant compile-time speed-up.

The other problem with Boost libraries is that they lack assertions appropriate for software development. For example, with a scoped_ptr it is better to have an init(T *) method that asserts the object has not been initialized, rather than just a general reset method, because with most uses of scoped_ptr<T>::reset() (in code that I've worked with) you're expecting it to currently be empty.
(Edit: Also, boost libraries are intimidating because the documentation is bad.)
4. Maybe.
That's not a rule of thumb. It may be something you do, and a reason I'd never work with you. Boost is a wonderful library that IME solves far more problems than it creates. Compile times aren't a problem with modern compilers, and your scoped_ptr example is nonsense.
See also <sys/queue.h>.
His example code is very dubious as it looks like C-with-classes rather than C++, mostly due to the lack of RAII.
Intrusive lists are still worth knowing and using, it's just that the author's reasoning was terrible. I found the Boost.Intrusive page to be much more knowledgeable: http://www.boost.org/doc/libs/1_35_0/doc/html/intrusive/intr...
He's a game programmer. He's keeping pointers to instances because these entities are part of an inheritance hierarchy[0]. That level of indirection is essential to his problem and completely unavoidable. On the other hand, std::list<std::unique_ptr<foo>> wouldn't change anything at all about the particularly inefficient memory arrangement of std::list<foo*>; std::unique_ptr is, after all, just a class with one pointer member.
When smart people write things you think are obviously dumb, you should invest some additional time trying to understand the context of their statements before writing them off. You'll learn more, and you won't come off as a flippant junior developer.
[0] http://www.codeofhonor.com/blog/tough-times-on-the-road-to-s...
The reasoning and requirements for him using intrusive lists was quite clear to me. The remaining article that starts at "Using intrusive lists" I found quite interesting and insightful.
My entire argument is against his evidence backing up statements like, "If you’re a programmer who uses the C++ STL std::list, you’re doing it wrong." It's unfair to improperly use std::list, compare it to a better solution, and then make a sweeping generalization. That's just outrageous.
The main things to take away from the blogpost were, I think, the concept of intrusive lists, to think about what objects look like in memory, and how you can construct code so that an object can be part of multiple containers and can remove itself from all of them when it's destroyed. Those are certainly concepts worth talking about, and so I think the blogpost was pretty great.
As for std::list... it's useless.
- "this guy" is actually a highly experienced and competent programmer.
- He programs games, with often very specific domain requirements and challenges. Try to understand context before making sweeping generalizations.
I understand game programming is a very different enterprise with very different tradeoffs and motivations, but I'm not sold on this for "pedestrian" application development.
In particular, this isn't something "wrong" with std::list. He has a situation where he wants an object to manage its own membership in a mutable container. He says you can't do this efficiently with a simple non-intrusive linked list. He is right. You also can't do it efficiently with an array (std::vector, in this context).
You can do it efficiently with an intrusive linked list, as he points out. You can also use a non-intrusive linked list in which each object holds an iterator to itself. Or you can use an associative structure (std::map, std::unordered_map), in which each object holds its own key.
The intrusive linked list solution is going to have the fastest container insert & delete operations of all of these. But that doesn't mean it is the best solution for every circumstance.
Another point to be made, which he kinda-sorta gets at, is that it is a good idea to know how to code a linked list. The bulk of data structure decisions are just figuring out what already written package to use. But there is definitely still a place for a custom, application-specific linked list, and these are not difficult to write.
Also, it doesn't appear the provided code gracefully handles removing an item from a list twice.
I feel like this kind of problem is actually a specific instance of a much larger problem - that being explicit about these things leads to more maintainable code, at the expense of some 'leet things' that can cause troubles if not implemented perfectly.
m_prevLink->m_nextNode = m_nextNode;
I don't see where m_prevLink is changed. If the previous link has gone away and you call RemoveFromList on this node a second time, it's going to chase that pointer.
The first two examples don't have this problem though.
Linked lists are seldom the right answer; contiguous blocks of memory are cache-friendly, and therefore algorithm-friendly.
STL performance is usually quite easy to beat in special cases as well (e.g. most game code). Some STL implementations are terrible, too: the MS one is riddled with locks and all sorts of sledgehammer thread-safety measures, which you just don't need if you know which bits of code are threaded and which aren't.
A vector of pointers gives no contiguity advantage over an intrusive list during traversal (or any other operation).
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n227...
And a github repository that is still maintained:
I think the reason for dropping centuries from dates was not to gain speed, but to save two bytes. In the 1970s, two bytes of storage cost a lot more than they do today, and space on punch cards was limited.