Too dangerous for C++

(blog.dureuill.net)

93 points · dureuill · 2y ago · 86 comments

60 comments · 11 top-level

kazinator2y ago· 10 in thread

> the Rc type does not support being sent between threads

So why even have such a thing in a language designed for concurrent programming from the ground up?

Arc should be called Rc, and that's it.

MaxRegret2y ago

That's what C++ does because it has no way to ensure that you use the atomic reference counts in multi-threaded code. But, as the author writes in the blog post, Rust can in fact ensure this. So it lets you use the more efficient non-atomic reference count for single-threaded use, saving the unnecessary cost of various memory access barriers.

Just because a language is designed for concurrent programming, it shouldn't make it impossible to achieve full single-threaded performance, as long as you're not compromising safety.

PaulDavisThe1st2y ago

this is a bit weird.

if you have only 1 thread, you don't need atomics, and thus not using atomic reference counts is fine

but if you have more than 1 thread, you can't use a non-atomic refcount, so you can't use Rc but must use Arc.

"but that's such a simple change, just change the decl with a 1 char addition! Plus, Rust won't let you do bad stuff if you've forgotten to change the type".

I guess I'm just old. Old enough that I've already implemented all the data structures and methods I need in C++, including safely passing around shared_ptr<T>.

Rusky2y ago

In Rust, you may have multiple threads yet still use non-atomic reference counts for objects that are never shared between multiple threads.
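
A minimal sketch of that point, assuming a hypothetical helper named `sum_on_workers`: the `Arc` is the only handle that crosses threads, while each worker creates and drops its own `Rc` locally, so the non-atomic count is never touched by two threads.

```rust
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: sums `shared` on `workers` threads.
/// Each thread also builds its own `Rc`, which never leaves that thread.
fn sum_on_workers(shared: Arc<Vec<u64>>, workers: usize) -> Vec<u64> {
    let handles: Vec<_> = (0..workers)
        .map(|id| {
            // Atomic increment: this clone is moved to another thread.
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                // Thread-local Rc: created, cloned and dropped on this thread only.
                let local = Rc::new(id as u64);
                let alias = Rc::clone(&local); // plain, non-atomic increment
                assert_eq!(Rc::strong_count(&local), 2);
                shared.iter().sum::<u64>() + *alias
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    let data = Arc::new(vec![1, 2, 3]);
    println!("{:?}", sum_on_workers(data, 3));
}
```

Moving `local` out of the closure, for example by returning it from the thread, would be rejected at compile time because `Rc` is not `Send`.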

FreeFull2y ago

Not every program that's written will even be multithreaded, and there's a significant cost to using atomic operations when you don't need them. What is the disadvantage of having non-atomic Rc be available?

skitter2y ago

And Rc can still be useful for multithreaded programs. Not every value needs to be shared between threads.

LegionMammal9782y ago

Why not have such a thing, when it is strictly more performant for any program that has data which isn't accessed concurrently?

kazinator2y ago

Because then you need to complicate the compiler with a diagnostic against misuse, which has to work 100% right in all situations and be maintained forever.

tialaramex2y ago

Because Rust has a safety culture, and provides threading, it is crucial that the compiler will reject types you cannot safely send to another thread. So it does.

So the "diagnostic against misuse" you're concerned about is a necessary part of the compiler anyway.

Indeed, although Rc has this line:

  impl<T: ?Sized, A: Allocator> !Send for Rc<T, A> {}

(which means roughly "You can't send this type to another thread")

It also has these lines:

  // Note that this negative impl isn't strictly necessary for correctness,
  // as `Rc` transitively contains a `Cell`, which is itself `!Sync`.
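
As a rough illustration of how these marker traits are checked, here is a sketch using two hypothetical helper functions, `require_send` and `require_sync`, whose only job is to carry a trait bound. They are not part of the standard library; the commented-out lines show calls the compiler would reject.

```rust
use std::cell::Cell;
use std::rc::Rc;
use std::sync::Arc;

// Hypothetical helpers: a call compiles only if `T` satisfies the bound.
fn require_send<T: Send>() -> bool {
    true
}
fn require_sync<T: Sync>() -> bool {
    true
}

fn main() {
    // Arc<i32> may be sent to, and shared with, other threads.
    assert!(require_send::<Arc<i32>>());
    assert!(require_sync::<Arc<i32>>());

    // Cell<i32> can be moved to another thread, but not shared by reference.
    assert!(require_send::<Cell<i32>>());
    // require_sync::<Cell<i32>>(); // error[E0277]: cannot be shared between threads safely

    // require_send::<Rc<i32>>(); // error[E0277]: cannot be sent between threads safely
    let _single_threaded = Rc::new(1); // fine as long as it stays on this thread
}
```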

LegionMammal9782y ago

It's not just Arc<T> vs. Rc<T> that's relevant for thread safety, though. Pretty much any kind of shared mutability requires extra protection (locks or atomicity) to work safely across threads, so there has to be some way to indicate whether or not that extra protection is present. Not to mention objects that interact with FFI, such as mutex locks, which must be unlocked from the same thread. It would be a huge performance drain to demand that "every value everywhere must be usable from every thread".
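
To make the "extra protection" concrete, here is a small sketch with a hypothetical function `parallel_count`. The shared counter is wrapped in `Arc<Mutex<..>>`; the single-threaded equivalent, `Rc<RefCell<..>>`, would not be accepted by `thread::spawn`.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical helper: increments one shared counter from `threads`
/// threads, `per_thread` times each, and returns the final value.
fn parallel_count(threads: usize, per_thread: u64) -> u64 {
    let counter = Arc::new(Mutex::new(0u64));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // The guard is dropped, and the lock released,
                    // at the end of this statement.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", parallel_count(4, 1000));
}
```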

MarkSweep2y ago

See the docs for Rc, including an example:

https://doc.rust-lang.org/std/rc/index.html#examples

vasilipupkin2y ago· 9 in thread

I am not going to be surprised to be downvoted, but you don't need shared_ptr in C++, that is itself overkill

The point of C++ is performance. If you don't need performance, why not just use Java or Python, why use Rust?

saghm2y ago

> The point of C++ is performance. If you don't need performance, why not just use Java or Python, why use Rust?

As a counterpoint, I also don't need to use `Rc` or `Arc` in Rust, and I can get by without reference counting. Why use C++?

brigadier1322y ago

You don't need performance until you do. Writing slow rust in my experience is just as easy as writing python. If not easier because the libraries are better designed.

bluGill2y ago

The point of rust is you get the performance of C++ with additional safety.

Const-me2y ago

For example, safe rust forbids code which writes to different elements of the same vector from different CPU cores. C++ compiler has no objections, and doing that is often the best way to parallelize computations.

glandium2y ago

> safe rust forbids code which writes to different elements of the same vector from different CPU cores.

It doesn't (see e.g. slice::split_at_mut). It however doesn't allow you to do that and mutate the vector at the same time.
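
A short sketch of this, assuming a hypothetical function `double_in_parallel` and using scoped threads (`std::thread::scope`, available since Rust 1.63). The two halves are disjoint mutable slices of the same vector, so each thread can write to its own part in safe code.

```rust
use std::thread;

/// Hypothetical helper: doubles every element, handling the two halves
/// of the slice on separate threads.
fn double_in_parallel(values: &mut [u32]) {
    let mid = values.len() / 2;
    // Two non-overlapping mutable views into the same buffer.
    let (left, right) = values.split_at_mut(mid);
    thread::scope(|s| {
        s.spawn(|| left.iter_mut().for_each(|v| *v *= 2));
        s.spawn(|| right.iter_mut().for_each(|v| *v *= 2));
    }); // both threads are joined before `scope` returns
}

fn main() {
    let mut v = vec![1u32, 2, 3, 4, 5];
    double_in_parallel(&mut v);
    println!("{:?}", v);
}
```

What the borrow checker rejects is resizing or otherwise mutating the vector itself while `left` and `right` are still borrowed.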

tomas7892y ago

I wouldn’t mind using unsafe rust for encapsulated cases where it makes sense (though this might not be the one as it could be done in safe rust). It is just nice to have this delineation and it can simplify debugging as well.

nickysielicki2y ago

The point of a program is first and foremost to be correct, performance is never more important than that. /dev/null is not in fact webscale.

shikon72y ago

You use Rust if you want performance (especially because there is no garbage collection), and strong safety guarantees.

If you don’t care about safety guarantees and abstractions like shared_ptr, you might just as well use C instead of C++.

dureuillOP2y ago

"performance" is about profiling and avoiding bottlenecks.

That I'm using reference counting on the error path of my parser (so there's something wrong with the input and the task is not going to complete) is very unlikely to become a bottleneck.

You may have a point in that I navigated codebases that were plagued with shared ptrs everywhere in the past, and that general style of programming is not going to yield good performance. But you shouldn't deal in absolutes.

stathibus2y ago· 8 in thread

There are valid selling points to rust's safety features, but this just feels like "I use rust because I need my compiler to be my training wheels". More of a self-own than anything.

dureuillOP2y ago

This strikes me as a weird criticism.

If I hired someone to paint my wall and they were saying "I'm not going to use any protection against splatters on the ground because I am that good and don't need training wheels", I would find that behavior very unprofessional and wouldn't want that person anywhere close to my wall.

I'm writing professional software, so I'll take any help from tooling that is available without compromising other aspects like performance. It also helps that Rust is much more productive than C++ overall.

About the self-own, my teams over the years lauded my low bug rate, be it in C++ or in Rust. I have a knack for correctness, which is why I prefer languages that make strong guarantees about it by construction to languages where I need to remember and regurgitate thousands of rules at every corner.

TheRoque2y ago

I don't see what's wrong with having training wheels. In fact, it's not really training wheels, it's just safeguards, and we all need them as much as possible (with a good balance between this and usability of course)

bestouff2y ago

Seeing the astounding number of CVEs in C++ code everywhere, everybody needs training wheels.

aprogr2y ago

My explanation? There are way more programs written in C/C++ than in Rust out there, so statistically more bugs are found. About the wheels? Maybe that's also true for Rust developers: https://www.cvedetails.com/vulnerability-list/vendor_id-1902...

techbrovanguard2y ago

No true Scotsman would use a compiler either, you should be writing programs in binary. Why, only a fool would use a computer to automate repetitive, error-prone tasks!

Dylan168072y ago

> More of a self-own than anything.

Only if there's a lot of programmers that don't need them.

There's roughly zero of those programmers. And you're not one of them.

bfrog2y ago

I heard for nearly two decades how type safety, const correctness, etc proved to create better programs in C++ than C. Now Rust could be viewed as the next iteration of that and it seems like many C++ developers like to try and paint this picture that the added type safety and correctness checks are somehow terrible. The mental gymnastics at play are a bit mind boggling really.

adrianN2y ago

Real programmers use butterflies.

kevr2d22y ago· 6 in thread

Pretty clickbaity title.

It's possible to implement in C++... so it's not "too dangerous" for C++. It's dangerous for people who don't have knowledge of what they're doing in C++; same as in any programming language.

eslaught2y ago

On the contrary, I thought it was quite apt. Follow the article's link to this Stack Overflow answer:

https://stackoverflow.com/a/15140227/1614219

It summarizes a discussion by the C++ standards committee that rejected the C++ version of Rc; one of the main arguments was the risk of Rc code being accidentally included in threaded code.

I would point out that this code can include: code you wrote years ago that you forgot includes Rc, code in libraries that was modified internally to use Rc and the authors forgot to mention it, code written by colleagues who aren't familiar with the pitfalls, etc. That's why this isn't a trivial problem to solve.

tialaramex2y ago

"Too dangerous" doesn't imply impossible.

It's too dangerous to parachute off the Eiffel tower. That doesn't mean it's impossible, periodically somebody does it.

xmcqdpt22y ago

I'm not a C++ expert but I believe it is not possible in current C++ to implement a pointer type that will cause a compiler error when it is sent to another thread.

jandrewrogers2y ago

Interesting thought experiment. While I haven’t tried it, one half-baked idea that comes to mind is to disable both copy and move for the pointer and instantiate the type in C++ thread-local storage. Not that I ever would.

PaulDavisThe1st2y ago

That's almost certainly true, but first you have to define "send to another thread"

38362936482y ago

std::jthread?

PaulDavisThe1st2y ago· 5 in thread

i.e. a single-threaded non-atomic shared_ptr

Rust fans can complain about "C++ has no central library system like crates" all they want, but there aren't many things you actually need when programming that don't exist for C++, even if you don't like them not coming in a little box that looks like other little boxes.

dureuillOP2y ago

The point of the article isn't that single-threaded non-atomic shared pointers are unfeasible in C++, it is that their usage is too dangerous.

The fact that it wasn't included in the standard library for this reason is an argument for this. The fact that even `shared_ptr` has thread-safety footguns, one of which made it to the famous C++ talk, “Curiously Recurring C++ Bugs at Facebook”[1], is another. By the way, every single one of the bugs from that talk is impossible in safe Rust.

[1]: https://youtu.be/lkgszkPnV8g?si=cCWASihvIGJ25Jf3

0xfaded2y ago

While we're at it, std::atomic<std::shared_ptr<T>>

https://en.cppreference.com/w/cpp/memory/shared_ptr/atomic2

dureuillOP2y ago

Yes, this is discussed in the article, I do not understand your point.

FpUser2y ago

This. As soon as I need something, it is a quick online search away. The amount of stuff available for C++ is staggering.

nevi-me2y ago

The criticism is less about what's available, for there is more available in C++ than Rust. The criticism is about ease of packaging in a cross-platform manner that is easy enough.

bsdpufferfish2y ago· 4 in thread

Shared objects interacting across threads is a bad idea. Java did it “safely” forever ago, it just throws a lock on everything

cyber_kinetist2y ago

Data sharing between threads is inherently too complex a model for programmers to manage (when systems get complicated enough), so in many cases it is better to think of a solution that avoids it altogether. This is why some concurrency-centric languages (Erlang, Go) chose message passing as the main paradigm, instead of going for locks everywhere at runtime (Java) or an incredibly complex type system that tries to prevent data races at compile time (Rust)...

pornel2y ago

Rust's type system for thread safety is actually remarkably simple. Types declare whether they add or remove thread safety (e.g. Mutex adds safety, non-atomic Rc removes). Structs automatically become non-thread-safe if they have non-thread-safe fields. Then all the functions that spawn threads or send data over channels require thread-safe types.

The fearless concurrency is real. It reliably prevents data races, use-after-free, and moving of thread-specific data to another thread. It works across arbitrarily large and complex call graphs, including 3rd party dependencies and dynamic callbacks. Plus immutability is strongly enforced, and global mutable state without synchronization is not allowed.

It doesn't prevent deadlocks, but compared to data corruption heisenbugs, these are pretty easy — attach a debugger and you can see exactly what deadlocked where.
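
A small sketch of the propagation rule described above, with a hypothetical `Job` struct and `run_jobs` function. `Job` is `Send` only because all of its fields are; the compiler derives that automatically and checks it where the receiving end of the channel is moved into the spawned thread.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical job type: every field is Send, so the struct is Send too.
// Replacing `Arc<Mutex<..>>` with `Rc<RefCell<..>>` would make `Job` !Send,
// and the `thread::spawn` call below would stop compiling.
struct Job {
    id: u32,
    results: Arc<Mutex<Vec<u32>>>,
}

/// Hypothetical helper: sends `n` jobs to one worker thread over a channel.
fn run_jobs(n: u32) -> Vec<u32> {
    let results = Arc::new(Mutex::new(Vec::new()));
    let (tx, rx) = mpsc::channel::<Job>();

    let worker = thread::spawn(move || {
        // Iterates until every sender has been dropped.
        for job in rx {
            job.results.lock().unwrap().push(job.id * 10);
        }
    });

    for id in 0..n {
        tx.send(Job { id, results: Arc::clone(&results) }).unwrap();
    }
    drop(tx); // close the channel so the worker loop ends
    worker.join().unwrap();

    let out = results.lock().unwrap().clone();
    out
}

fn main() {
    println!("{:?}", run_jobs(3));
}
```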

bsdpufferfish2y ago

> global mutable state without synchronization is not allowed.

That's the java model again. I don't want fearless concurrency, I want intentionally designed threads.

pkolaczk2y ago

Message passing is not any easier or safer though. For every problem with shared memory concurrency you can draw a dual problem in message passing. See: https://songlh.github.io/paper/go-study.pdf

For me, single-threaded intra-task concurrency using async/await turned out to be safer and also easier to work with than either of the above mentioned concurrency models. Just a single loop with a top level select - everything is sequential and easy to reason about, also no need for any synchronization like locks or shared atomic pointers.

cjensen2y ago· 2 in thread

The two criticisms at the end are... odd.

First, there is criticism that assigning to a shared_ptr is not synchronized so it would be bad to share a single shared_ptr object between threads. True, but that is no different than literally every other non-atomic object in C++. It's not surprising in any way.

Second, there is criticism that assigning to the object pointed at by the shared_ptr is not synchronized between threads. This is odd because that's not actually different than a single thread where there are two shared_ptrs pointing to the same object. That is, even with single threading you have a problem you must be careful about.

rst2y ago

But if the Rust versions are as safe as claimed, then you're making the critique of C++ stronger, by pointing out that the pitfalls are easier to fall into than the blog post presents -- for the second, you don't even need threads! (And aliasing is one of the things that Rust's borrow machinery at least tries to address, even in a single-threaded context.)

So, is he wrong about the Rust part?

dureuillOP2y ago

In this context, "unsynchronized access" refers to read/write operations happening concurrently on multiple threads, *not* to the shared pointers pointing to different objects as a result of the assignment.

Unsynchronized access to the pointed-to object will typically cause a specific kind of race condition called a data race, which is undefined behaviour. As it requires threads, it cannot happen in a single-threaded context.

ok1234562y ago· 2 in thread

> Apparently, this is enough of an issue that C++20 added a partial template specialization to std::atomic<std::shared_ptr>. My advice, though, would be "don't do that!". Instead, keep your shared pointer in a single thread, and send copies to other threads as needed.

This is to support an atomic lock-free shared_ptr. You can then use this as a building block for building lock-free data structures.

dureuillOP2y ago

Interesting, I was lacking this context. Could you provide me with more information about this? I only saw atomic_shared_ptr come up in discussions about bugs up to now.

ok1234562y ago

https://www.youtube.com/watch?v=gTpubZ8N0no

The target of this optimization is low-latency code. Rendezvous will not work for that.

PaulDavisThe1st2y ago· 2 in thread

From the stackoverflow link within TFA:

> With GCC when your program doesn't use multiple threads shared_ptr doesn't use atomic ops for the refcount. This is done by updating the reference counts via wrapper functions that detect whether the program is multithreaded (on GNU/Linux this is done by checking a special variable in Glibc that says if the program is single-threaded[1]) and dispatch to atomic or non-atomic operations accordingly.

> I realised many years ago that because GCC's shared_ptr<T> is implemented in terms of a __shared_ptr<T, _LockPolicy> base class, it's possible to use the base class with the single-threaded locking policy even in multithreaded code, by explicitly using __shared_ptr<T, __gnu_cxx::_S_single>. You can use an alias template like this to define a shared pointer type that is not thread-safe, but is slightly faster[2]:

dureuillOP2y ago

I would rather use the non-atomic shared pointer from Boost[1] linked upthread than a non-standard non-portable implementation detail from GCC, but yes, it exists.

You can definitely implement a non-atomic non-threadsafe shared pointer in C++; my point in the article is that actually using it is very error-prone. This is supported by the type being excluded from the standard library, with one of the reasons being the risk of bugs.

[1]: https://www.boost.org/doc/libs/1_65_0/libs/smart_ptr/doc/htm...

PaulDavisThe1st2y ago

The extent of the error prone-ness depends entirely on what "send to another thread" means (i.e. precisely how this is done).

reflexe2y ago· 1 in thread

From my experience, the biggest footgun with shared_ptr and multi threading is actually destruction.

It is very hard to understand which thread will call the destructor (which is by definition a non-thread-safe operation), and whether a lambda is currently holding a reference to the object, or its members. Different runs result in different threads calling the destructor, which is very painful to predict and debug.

I think that rust suffers from the same issue, but maybe it is less relevant as it is a lot harder to cause thread safety issues there.

dureuillOP2y ago

> which is by definition a non-thread-safe operation

yes, but at this point, since the reference count is reaching 0, there is supposed to be only that one thread accessing the object being destroyed, so the destruction not being thread-safe should not be a problem.

If otherwise, it means there was a prior memory error where a reference to the pointed-to object escaped the shared_ptr. From there the code is busted anyway. By the way it cannot happen in Rust.

> Different runs result in different threads calling the destructor

What adverse effects can happen there? I can think of performance impact, if a busy thread terminates the object, or if there is a pattern of always offloading termination to the same thread (or both of these situations happening at once). I can think of potential deadlocks, if a thread holding a lock must take the same lock to destroy the object (unlikely to happen in Rust where the Arc object would typically contain the object wrapped in its mutex and the mutex wouldn't be reused for locking other parts of the code). There isn't much else I can think of, what do you have in mind?

> whether a lambda is currently holding a reference to the object, or its members

This cannot happen in Rust. If a lambda is holding a reference to the object, then it either has (a clone of) the Arc, or is a scoped lambda to a borrow of an Arc.
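
The ownership rule above can be sketched as follows, using a hypothetical `Resource` type that counts its own destructions. Each closure owns a clone of the `Arc`, so the object cannot be destroyed while any closure still uses it; which thread ends up running `Drop` is not deterministic, but it runs exactly once.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

// Hypothetical resource that records how many times it is destroyed.
struct Resource {
    drops: Arc<AtomicUsize>,
}

impl Drop for Resource {
    fn drop(&mut self) {
        self.drops.fetch_add(1, Ordering::SeqCst);
    }
}

/// Hypothetical helper: shares one `Resource` with `n` threads and
/// returns how many times its destructor ran.
fn drops_after_sharing(n: usize) -> usize {
    let drops = Arc::new(AtomicUsize::new(0));
    let resource = Arc::new(Resource { drops: Arc::clone(&drops) });

    let handles: Vec<_> = (0..n)
        .map(|_| {
            // Each closure owns its own clone of the Arc.
            let resource = Arc::clone(&resource);
            thread::spawn(move || {
                let _keep_alive = &resource;
            })
        })
        .collect();

    // Release the original handle; whichever thread drops the last
    // clone runs `Drop`.
    drop(resource);
    for h in handles {
        h.join().unwrap();
    }
    drops.load(Ordering::SeqCst)
}

fn main() {
    println!("{}", drops_after_sharing(4));
}
```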

flohofwoe2y ago

Refcounted memory management on a large scale is slow anyway, with or without atomic refcounting. The bigger problem is that Rc, Arc or shared_ptr often only manage one small object, and that object lives in a separate tiny heap allocation. So you end up with many tiny heap allocations spread more or less randomly around in memory, and the likelihood of getting cache misses on access is much higher than when tightly packing the underlying data into arrays and walking over the array items in order.

And if you only have a small number of refcounted references in your program, the small performance difference between atomic and non-atomic refcounting doesn't matter either.

Same problem with Box and unique_ptr btw, a handful is ok, but once that number grows into the thousands all over the codebase it's hard to do any meaningful optimization (or even figure out how much performance you're actually losing to cache misses because it's a death-by-a-thousand-cuts scenario).
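
To illustrate the two layouts being compared, here is a sketch with a hypothetical `Particle` type. Both functions compute the same value; the difference is that the first follows one pointer per element into separate heap allocations, while the second walks a single contiguous buffer. No timings are claimed here; the actual cost depends on the allocator and access pattern.

```rust
use std::rc::Rc;

// Hypothetical element type, used only for illustration.
#[derive(Clone, Copy)]
struct Particle {
    x: f32,
    mass: f32,
}

/// One heap allocation per particle: every access dereferences an Rc.
fn sum_scattered(items: &[Rc<Particle>]) -> f32 {
    items.iter().map(|p| p.x * p.mass).sum()
}

/// All particles stored inline in one contiguous buffer.
fn sum_packed(items: &[Particle]) -> f32 {
    items.iter().map(|p| p.x * p.mass).sum()
}

fn main() {
    let packed: Vec<Particle> = (0..4)
        .map(|i| Particle { x: i as f32, mass: 2.0 })
        .collect();
    let scattered: Vec<Rc<Particle>> = packed.iter().map(|p| Rc::new(*p)).collect();
    println!("{} {}", sum_packed(&packed), sum_scattered(&scattered));
}
```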
