Coming from C++, I assumed this was the only way, but Rust has an interesting approach where individual objects do not pay any per-instance cost, because virtual dispatch is handled by fat pointers: you carry the vtable pointer around in a fat pointer (`&dyn MyTrait`) only when needed, not in every instance.
Do you know exactly how this is deduced?
A specific type/reference to a type will always use static dispatch.
fn foo(bar: &Baz) { bar.thing(); }
A dyn trait reference will always use dynamic dispatch and carry around the vtable pointer.
fn foo(bar: &dyn BazTrait) { bar.thing(); }
One of the papers I had bookmarked when toying with my own language design was by someone who had worked out how to make interfaces as fast as, or faster than, vtables by using perfect hashing and using the vtable as a hash table instead of a list.
You can also, when inlining a polymorphic call, put a conditional block in that bounces back to full dispatch if the call occasionally doesn’t match the common case. The problem with polymorphic inlining though is that it quickly resembles the exact sort of code we delete and replace with polymorphic dispatch:
if (typeof arg1 === "string") {
} else if (typeof arg1 === …) {
} else if (…) {
} else if (…) {
} else {
}

Could you explain this a bit more? The word "list" makes me think you might be thinking that virtual method lookup iterates over each element of the vtable, doing comparisons until it finds a match -- but I'm certain that this is not how virtual method invocation works in C++. The vtable is constructed at compile time and is already the simplest possible "perfect hash table": a short, dense array with each virtual method mapping to a function pointer at a statically known index.
So these guys essentially assigned a hash code to every function of every interface, and then instead of dispatching with `obj.vtable[12]` you would do modular arithmetic, `x = signature.hash % len(obj.vtable)`, and call that slot.
I believe this was sometime around 2005-2008 and they found that it was fast enough on hardware of that era to be usable.
One caveat with "hash vtables" is that you only really see a performance win when the interface has a lot of specializations.
For instance, if you treat some collections as read only, you can define comprehensions across them with a single implementation. But that means the mutators have to be contained in another type, which a subset will implement, and may have covariant inputs.
There's absolutely nothing wrong with this code. It's just not as extensible.
It's a 'closed world' representation where the code assumes it knows about every possibility. This makes extension more difficult.
The code itself is extraordinarily good and performant.
With the polymorphic approach, I just have to create the new subtype, and all the users can do the right thing (if they were written with polymorphism in mind, anyway - if they use virtual functions on the base class).
For me, the key insight was from the last paragraph of the article:
C++23 introduces "deducing this", which is a way to avoid the performance cost of dynamic dispatch without needing to use tricks like CRTP, by writing:
class Base {
public:
auto foo(this auto&& self) -> int { return 77 + self.bar(); }
};
class Derived : public Base {
public:
auto bar() -> int { return 88; }
};
I wish the article had gone into more detail on how this works, when you can use it, and what its limitations are.

With concepts, templates, and compile-time execution, there is no need for CRTP, and in addition they can give better error messages about which method is dispatched to.
But still CRTP is widely used in low-latency environments :)