I recently had to work with C++ code and... it is not a happy story anymore:
- Lots of magic like described here
- F**ing templates - for those who like them, did you ever see a C++ core file? Or tried to understand a single symbol?
- Standarization that feels like pulling more Boost into the language, which means more templates. Which makes core files incomprehensible.
It used the be that average dev who knew C could read and work with simple C++. This is no longer true. C++ is no longer a better C.
P.S. Example from recent core file, one line in the stack trace: boost::asio::asio_handler_invoke<boost::asio::detail::binder2<core::AsyncSignalCatcher<2, 15, 17>::waitForSignal<XServer::m_accept_loop(tsr::Data&)::<lambda(spawn::yield_context)>::<lambda(int)> >(XServer::m_accept_loop(tsr::Data&)::<lambda(spawn::yield_context)>::<lambda(int)>&&)::<lambda(const boost::system::error_code&, int)>, boost::system::error_code, int> > (function=...)
It is possible, and perhaps even likely that the C++ you wrote in 1994 was indeed this "better C" and maybe even the C++ you read, the language Stroustrup wrote about in 1985 although "C with Objects" is much older still. The ISO document (C++ 98 aka ISO 14882:1998) was four years into your future in 1994, but the committee to write that document had existed for quite some time. Stroustrup's 1991 Second Edition of his appropriately already big book "The C++ Programming Language" explains that the committee have accepted Templates, although I believe at that point they didn't realise they'd inadvertently thus added an entirely new meta-language which is programmed differently than the rest of C++
Boost comes much later, it's only about as old as the actual ISO document.
Erwin Unruh demonstrated to the C++ commitee how they had accidentally created a second, compile time programming language inside the primary language. He did this by calculating primes and printing them in error messages.
People got carried away a bit with the power made available by this new language. Unfortunately there are no basic ergonomics in template metaprogramming because it was not intended to be a programming language.
templates are totally fine for basic containers. Much better than C macro shenanigans.
The world is still looking for a “better C”. Zig, Odin, Jai and probably more are all trying. Of the three I think Jai is the closest. But it’ll be awhile.
I have 20+ years of experience writing C++.
Yes, I've looked at "core" C++ headers and source. The most annoying part to me is style (mixed tabs and spaces, curly braces at wrong places, parenthesis or not, the usual style complaints). But other than that they're very readable to a seasoned C++ engineer.
I've also tried to understand symbols. You're right, they're difficult. But there's also tooling available to do it automatically. Even if you don't want to use the tools, there is a method to the madness and it's documented...
Let me ask ChatGPT:
> What tool lets me translate an exported symbol name to a C++ name?
C++filt
It's categorized as a demangler. That's your search term to look for (I had to remember what it was).Then I asked:
> Is there a function in the standard library which allows to mangle or demangle a given name or symbol?
It tells about `__cxa_demangle` for GCC. While I had forgotten about that, I'm pretty sure there is (or perhaps something similar) in the standard library.
It also suggests to use a library such as `abi::__cxa_demangle`. Hah, that's what I was looking for. It's an implementation-specific API (eg, compiler-specific) API used as an example. It was mentioned on `std::type_info::name()` page here:
https://en.cppreference.com/w/cpp/types/type_info/name
So, to continue replying to you: yes, it's annoying but it's solvable with tools that you can absolutely integrate into your IDE or command-line workflow.
> - Standarization that feels like pulling more Boost into the language, which means more templates.
The boost libraries are open source and their mailing lists are active. If you don't like a given library because it has too many templates then you could make one with fewer templates.
And, as standardization goes, it's also quite open source. The C++ committee is very open and receptive to improvements. The committee are volunteers (so their time is limited) and (usually) have their own improvements to the standard that they want. So you have to drive the changes you want (eg, actively seek feedback and engagement).
> P.S. Example from recent core file, one line in the stack trace:
I've seen much longer -- I've seen templates entirely fill a terminal buffer for a single line. That's extremely rare, definitely not fun, and debuggability is absolutely a valid reason to refactor the application design (or contribute library changes).
I find it useful to copy the template vomit into a temporary file and then run a formatter (eg clang-format), or search/replace `s/(<\({[)/\1\n/g` and manually indent. Then the compiled type is easier to read.
Some debuggers also understand type aliases. They'll replace the aliased type with the name you actually used, and then separately emit a message (eg, on another line) indicating the type alias definition (eg, so you can see it if you don't have a copy of the source)
The problem the article highlights appears to be an implementation defect: in my libstdc++ test just now, we do, in fact, mark the list as nothrow move constructible. The standard should mandate that std::list be infallibly moveable.
Are we going to indict a whole programming model based on an isolated implementation bug? If so, well, isn't doing that from a "C is better" perspective the galactic black hole calling the kettle black?
#include <cstdio>
#include <list>
#include <type_traits>
#include <vector>
struct Node;
struct Connection {
Node *from, *to;
};
struct Node {
std::vector<Connection> connections;
};
int
main()
{
printf("%d\n", std::is_nothrow_move_constructible_v<std::list<Node>>);
return 0;
}I would default to just using std::unique_ptr<Node> in a situation like this, especially since using std::list suggests performance isn't critical here, so the additional indirection probably doesn't matter.
In fact, full-on garbage collection wouldn't have prevented the bug here, and could arguably have made it worse. The problem is that nodes were unexpectedly being copied and then the originals deleted. With GC, you'd still have the copy, but never delete the originals, so you end up with split brain where there are multiple copies of each node and various pointers point to the wrong ones. That'd be pretty painful to debug!
IMO the language-level problem, if there is one, is that C++ is too willing to copy in cases where you would expect it to move instead. This is, of course, for backwards compatibility with the before-times when there was no move and copying was the only logical thing to do. But I think life would be better these days if all non-trivial copies had to be requested explicitly.
std::shared_ptr comes with std::weak_ptr. Referencing counting is rather ham-fisted approach but is certainly a solution.
> IMO the language-level problem, if there is one, is that C++ is too willing to copy in cases where you would expect it to move instead.
IMO that's not a problem in the language but a problem with the engineer (misunderstanding when std::move is necessary) and the tooling (linter/static analyzer not clearly identifying that something should be moved instead, and raising a linter warning for it).
For that matter, the places where I see std::list used aren't places where "performance isn't important" but rather places where an inexperienced engineer was put in charge of implementation and a senior engineer accepted it. I can't remember the last time I accepted someone using std::list in a code review because there has always been a better design available even if it necessitated some teaching. If a stable pointer address is needed then indeed a smart pointer is the correct solution (perhaps std::vector<std::unique_ptr>). There are other reasons I've had coworkers cite for using std::list (eg constant allocation time) but that's generally resolved with std::vector.reserve(upper bound to size) or eg a slab allocator (unfortunately, I'm not aware of a standard-provided slab allocator, though to be fair I'm not very familiar with C++ standard allocators in general).
> I think life would be better these days if all non-trivial copies had to be requested explicitly
While I don't agree superficially (smells like bringing along deep-copy problems), I think the idea merits some thought experiments.
It would be fairly trivial to do that for non-plain-old-data types by deleting the copy constructor/operator (so it cannot happen implicitly) and providing a `make_copy(...)` function instead.
... and yet, almost all vulnerabilities in C++ code are still written in C style, not even legacy C++.
They just got tripped up by an obscure language feature that made the Nodes move anyway.
Apparently the folks didn't learn their lesson with Web standards, given the power they gave Google to transform the Web into ChromeOS.
(Microsoft was doing this to C++ well before the early part of the millennium…)
Edit: and in non-joke fairness, Microsoft has really come a long way on this regard.
In reality, it's a defect in the standard, plus a quality-of-implementation issue in the Microsoft's C++ Standard Library.
struct Node {
std::vector<Connection> connections;
};struct Connection {
Node* from, to;
};Does this mean that to create the vector of connections, Nodes are created, and references are taken to store in the Connections? And then the Nodes are stored in the list, with std::move()?
I don't understand why you would want to go down that road. Intuitively, I would assume that you are not safe from an object copy somewhere down the line and your graph then comes crashing down like a house of cards. Wouldn't it make more sense to store the nodes as pointers? If you like to live dangerously, something like:
struct Graph {
std::vector<std::list<Node\*>> nodes;
};Or better:
struct Graph {
std::vector<std::list<std::unique_ptr<Node>>> nodes;
};The later will give you plenty of warnings if you do not copy Nodes around with std::move().
Or less performant, but maybe safer, std::shared_ptr<Node>, together with:
struct Connection {
std::weak_ptr<Node> from, to;
};so that you have some check guarantees before access?
> It may make more sense to create pointers here but that's a larger change to deal with and ensure correctness versus just swapping the vector out for another list. I don't claim to like that solution but it seems to me like legacy C++ code in general is fragile enough as it is,
That's more than fair enough. In such code, more often than not, you make what you think is a small change, and you end up with an entire cascade of unexpected side-effects that pop-up because the original assumptions and the testing scope of those assumptions were lost. A can of worms best left unopened.
By the way, wouldn't such an error be something that gets detected by the "new" ASAN functionality that has been added to the newer MSVC toolchain you are using?
If you find storing pointers to elements too perilous, you should probably just make a container of pointers instead.
The interesting part is why it worked in VS2013 but failed in VS2022.
I thought the explanation he came up with was interesting but such a change could have easily been brought about by changes to reallocation parameters in the implementation.
Note I’m not asking about move constructor vs memmove/cpy but rather the use of copy constructor vs move depending on exception behavior? Is it something like prefer no throw copy constructor over “throwing” move?
Anyway, the reason to use move instead of copy is for performance. Move constructors are faster because they can leave the source object modified (e.g., take over control of a pointer to deep contents). This falls apart when the move constructor can throw, because the container might be part way through a resize when this happens, leaving the object before the exception modified and the code in an unrecoverable state.
Basically, unless we can be super duper 100% certain we’re going to make it through the operation without throwing an exception, we’re going to copy, leaving the objects in question in an unaltered state, and holding to the promises of the standard.
However thanks for explaining the issue. This one is obvious and I just completely failed to think about how you ensure the source object is in a safe state if an exception occurs part way through moving the source data. It seems to imply the old MSVC behaviour was incorrect in such a scenario, but I hadn’t considered that possibility so assumed it was correct and therefore didn’t think of why this behaviour is required.
My solution is of course to simply not allow exceptions because the c++ model of everything implicitly throwing is just as annoying as Java’s “let’s explicitly annotate everything” model albeit with different paths to sadness.
It isn't likely to leave the code in an unrecoverable state even if recovery is calling std::terminate (or worse).
It is likely to leave the data in an unrecoverable state. Imagine that a vector of 4 items was resized -- the first two objects move successfully, but the third one throws an exception. Then in your move function, you catch that and decide to undo your changes before propagating the error. Then when you're undoing the changes the first object throws an exception when it's being moved back. Oof! At best you've got multiple active exceptions (legal if you're in the catch handler, but should be rare and definitely should be avoided) and at worst your data is indeed unrecoverable (thus one of many reasons why std::terminate is the default option when multiple exceptions are alive on the same stack).
Last century move didn't exist. The terrible C++ move (which is basically an actual "destructive move" plus a default create step) was invented for C++ 11 which was, as the name suggests, standardised only in 2011.
So back then everybody is using copy assignment semantics. Your compiler might be smart enough, especially for trivial cases, to spot the cheap way to deliver the required semantics, but it might not especially as things get tricky (e.g. a std::vector of std::list) and semantically it's definitely a copy, not a move.
As a result the "non-move" that you're astonished by is how all C++ code last century was written, the semantics you're just assuming as necessary didn't even exist in ISO C++ 98 and it is considered important that such code still works.
Why does std::vector<T> require T's move constructor to be noexcept (or else it falls back to copying instead)?
The reason goes something like this:
When std::vector<T> grows, it needs to move or copy all of its elements into a new, larger-capacity array. It would prefer to move them, since that's a lot more efficient than copying (for non-trivial types). But what happens if it moves N elements, and then the move constructor for element N+1 throws an exception? Elements 0-N have been moved away already, so the vector is no longer valid as-is. Should it try to move those elements back to the original array? But what if one of those moves fails?
The C++ standards body decided to sidestep this whole problem by saying that std::vector<T> will refuse to use T's move constructor unless it is declared noexcept, so the above problem can't happen.
In my opinion, this was a huge mistake. Intuitively, everyone expects that when an std::vector<T> grows, it's going to move the elements, not making a ton of copies. Often, these copies result in hidden performance problems. Arguably the author of this post is lucky than in their case, the copies resulted in outright failure, thus revealing the problem.
There seem to be two other possibilities:
* std::vector<T> could simply refuse to compile if the move constructor was not `noexcept`. I think this could have been done in a way that wouldn't have broken existing code, if it had been introduced before move constructors existed in the wild -- unfortunately, that ship has now sailed and this cannot be done now without breaking people.
* std::vector<T> could always use move constructors, even if they are not declared `noexcept`, and simply crash (std::terminate()) in the case that one actually throws. IMO this would be fine and is the best solution. Move constructors almost never actually throw in practice, regardless of whether they are declared as such, because move constructors are almost always just "copy pointer, null out the original". You don't put complex logic in your move constructor. And anyway, C++ already has plenty of precedent for turning poorly-timed exceptions into terminations; why not add another case? But I think it's unlikely the standards committee would change this now.
Move constructor and move operator will bind to an R-value reference if the move constructor or move operators are available. Conversely, if those functions which declare to not throw anything do end up throwing something then the result is std::terminate.
The only things that determine whether to use a move or copy is whether the reference is an R-value and whether the source or destination is const.
You can declare a {const R-value move operator (not a constructor) for the left-hand} and/or {const R-value move operator or constructor for the right-hand side} of the argument. But you won't be able to modify anything not marked mutable. You shouldn't do that though: that's a sure way to summon nasal demons. That said, I see it fairly often from less experience engineers, particularly when copy-pasting a copy operator intending to modify it for a move operator.
I have a hard time thinking of C++ and the STL as separate. Even our internal utilities and such tend to be STL-like although often with safer defaults.
There are still issues (the iterator API used by for(:) is very hard to make safe without terrible perf issues, though I was looking at this recently and the compilers are doing much better than they used to).
Things like unique_ptr and shared_ptr do not meaningfully improve the security of c++ despite being presented as if they did (all serious c++ projects already had smart pointers before the stl finally built them in so presenting them as a security improvement is disingenuous), and because of the desire to have shared_ptr be noninvasive it’s strictly worse than most other shared ownership smart pointers I’ve used.
The TLDR is: Using `move_if_noexcept` instead of plain old `move` can help you provide the "strong exception guarantee." For what _that_ is, see cppreference: https://en.cppreference.com/w/cpp/language/exceptions#Except...
and/or the paper by Dave Abrahams that introduced the term, "Exception-Safety in Generic Components: Lessons Learned from Specifying Exception-Safety for the C++ Standard Library." https://www.boost.org/community/exception_safety.html
> Is it something like prefer no throw copy constructor over “throwing” move?
Almost. If move won't throw (or if copy isn't possible), we'll move. But given a choice between a throwing move and any kind of copy, we'll prefer copy, because copy is non-destructive of the original data: if something goes wrong, we can roll back to the original data. If the original data's been moved-from, we can't.
Regarding your question, nothrow operations are essential to maintaining invariants. And maintaining invariants is how you make code correct in a world where UB exists.
The problem with this flavour of UB being a programmer's promise "this is fine, trust me, no run-time checks needed" to the compiler is that a) it's made by the programmer by omitting said run-time checks ― and that often happens accidentally, not intentionally; b) the compilers are really bad at pointing out to the programmer places where they took advantage of such promises, which really complicates the task of writing conforming programs. Every time I add two int's, I promise to the compiler that an overflow won't happen: and of course, the moment an UB happens, all invariants cease to hold, so trying to find the initial bug where you've accidentally broke one of invariants turns into a nightmare.