Is there any penalty/cost of virtual inheritance in C++, when calling non-virtual base method?

Developer https://www.devze.com 2023-02-22 00:09 Source: Internet
Does using virtual inheritance in C++ have a runtime penalty in compiled code when we call a regular member function of its base class? Sample code:

class A {
    public:
        void foo(void) {}
};
class B : virtual public A {};
class C : virtual public A {};
class D : public B, public C {};

// ...

D bar;
bar.foo ();


There may be, yes, if you call the member function via a pointer or reference and the compiler can't determine with absolute certainty what type of object that pointer or reference points or refers to. For example, consider:

void f(B* p) { p->foo(); }

void g()
{
    D bar;
    f(&bar);
}

Assuming the call to f is not inlined, the compiler needs to generate code to find the location of the A virtual base class subobject in order to call foo. Usually this lookup involves checking the vptr/vtable.

If the compiler knows the type of the object on which you are calling the function, though (as is the case in your example), there should be no overhead because the function call can be dispatched statically (at compile time). In your example, the dynamic type of bar is known to be D (it can't be anything else), so the offset of the virtual base class subobject A can be computed at compile time.
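
The two situations can be put side by side in a minimal sketch. The counter member `calls` and the function names `call_through_pointer` and `run_both_paths` are assumptions added for illustration; they are not part of the question's code. Both calls reach the single shared A subobject, but only the call through the pointer may need a run-time lookup of where A lives.

```cpp
#include <cassert>

class A {
public:
    int calls = 0;           // hypothetical counter so the calls are observable
    void foo() { ++calls; }  // non-virtual member of the virtual base
};
class B : virtual public A {};
class C : virtual public A {};
class D : public B, public C {};

// The dynamic type behind p is unknown here, so the compiler generally has to
// locate the A subobject at run time before it can pass `this` to A::foo.
void call_through_pointer(B* p) { p->foo(); }

// Runs both kinds of call on one object and returns how often A::foo ran.
int run_both_paths() {
    D bar;                       // complete type is known to be D
    bar.foo();                   // position of A inside D is a compile-time fact
    call_through_pointer(&bar);  // position of A must be found through p
    // B and C share one A, so converting through either base gives the same address.
    assert(static_cast<A*>(static_cast<B*>(&bar)) ==
           static_cast<A*>(static_cast<C*>(&bar)));
    return bar.calls;
}
```

Calling `run_both_paths()` returns 2, since both calls increment the same counter. Whether the pointer call really costs more depends on the compiler and on whether `call_through_pointer` is inlined.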


Yes, virtual inheritance has a run-time performance overhead. This is because the compiler, given an arbitrary pointer or reference to an object, cannot locate its virtual base sub-objects at compile time. In contrast, with single inheritance, each sub-object is located at a static offset from the start of the object. Consider:

class A { ... };
class B : public A { ... };

The memory layout of B looks a little like this:

| B's stuff | A's stuff |

In this case, the compiler knows where A is. However, now consider the case of multiple virtual inheritance (MVI).

class A { ... };
class B : public virtual A { ... };
class C : public virtual A { ... };
class D : public C, public B { ... };

B's memory layout:

| B's stuff | A's stuff |

C's memory layout:

| C's stuff | A's stuff |

But wait! When D is instantiated, it doesn't look like that.

| D's stuff | B's stuff | C's stuff | A's stuff |

Now suppose you have a B*. If it really points to a B, then A sits right next to the B part. But if it points to the B sub-object of a D, then to obtain an A* you have to skip over the C sub-object. Since any given B* could point to either a B or a D at run time, the pointer adjustment has to be computed dynamically. At a minimum, this means generating code that finds the offset by some means, rather than having the value baked in at compile time, as happens with single inheritance.
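
The layout argument can be checked by measuring the distance from a sub-object to the shared A at run time. This is a minimal sketch; the data members and the helper names (`distance_to_A`, `offsets_depend_on_complete_object`, and so on) are assumptions introduced for illustration, and the concrete byte values depend on the ABI.

```cpp
#include <cassert>
#include <cstddef>

struct A { int a = 0; };
struct B : virtual A { int b = 0; };
struct C : virtual A { int c = 0; };
struct D : C, B { int d = 0; };

// Distance in bytes from a sub-object to the shared A sub-object.
template <class From>
std::ptrdiff_t distance_to_A(const From* p) {
    const A* pa = p;  // conversion to the virtual base
    return reinterpret_cast<const char*>(pa) - reinterpret_cast<const char*>(p);
}

// Inside a D, the B part and the C part sit at different addresses but share
// one A, so no single constant can describe "where A is" for both of them.
bool offsets_depend_on_complete_object() {
    D d;
    const B* b_part = &d;
    const C* c_part = &d;
    assert(static_cast<const A*>(b_part) == static_cast<const A*>(c_part));
    return distance_to_A(b_part) != distance_to_A(c_part);
}

// Helpers for comparing a standalone B with the B part of a D.
std::ptrdiff_t distance_in_standalone_B() { B b; return distance_to_A(&b); }
std::ptrdiff_t distance_in_B_part_of_D() {
    D d;
    return distance_to_A(static_cast<const B*>(&d));
}
```

On common ABIs, `distance_in_standalone_B()` and `distance_in_B_part_of_D()` return different values, which is why code that only sees a `B*` cannot use a fixed offset.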


At least in a typical implementation, virtual inheritance carries a (small!) penalty for (at least some) access to data members. In particular, you normally end up with an extra level of indirection to access the data members of the object from which you've derived virtually. This comes about because (at least in the normal case) two or more separate derived classes have not just the same base class, but the same base class object. To accomplish this, both of the derived classes have pointers to the same offset into the most derived object, and access those data members via that pointer.

Although it's technically not due to virtual inheritance, it's probably worth noting that there's a separate (again, small) penalty for multiple inheritance in general. In a typical implementation of single inheritance, you have a vtable pointer at some fixed offset in the object (quite often the very beginning). In the case of multiple inheritance, you obviously can't have two vtable pointers at the same offset, so you end up with a number of vtable pointers, each at a separate offset in the object.

IOW, the vtable pointer with single inheritance is normally just static_cast<vtable_ptr_t>(object_address), but with multiple inheritance you get static_cast<vtable_ptr_t>(object_address+offset).

Technically, the two are entirely separate -- but of course nearly the only use for virtual inheritance is in conjunction with multiple inheritance, so it's semi-relevant anyway.
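
The pointer adjustment that plain multiple inheritance requires can be observed directly. In this sketch the types `X`, `Y` and `Z` and the function name are assumptions made up for illustration; no virtual inheritance is involved.

```cpp
#include <cassert>

struct X { virtual ~X() = default; int x = 1; };
struct Y { virtual ~Y() = default; int y = 2; };
struct Z : X, Y { int z = 3; };

// Two non-empty base sub-objects cannot share an address, so converting a Z*
// to one of its bases has to add a (compile-time constant) offset.
bool bases_at_distinct_addresses() {
    Z obj;
    Y* as_y = &obj;                         // implicit conversion applies the offset
    assert(static_cast<Z*>(as_y) == &obj);  // the reverse cast removes it again
    const void* x_addr = static_cast<X*>(&obj);
    const void* y_addr = as_y;
    return x_addr != y_addr;
}
```

Unlike the virtual-base case, the offset here is known from the static types alone, so the adjustment is a constant addition rather than a lookup.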


Concretely in Microsoft Visual C++ there is an actual difference in pointer-to-member sizes. See #pragma pointers_to_members. As you can see in that listing - the most general method is "virtual inheritance" which is distinct from multiple inheritance which in turn is distinct from single inheritance.

That implies that more information is needed to resolve a pointer-to-member in the case of presence of virtual inheritance, and it will have a performance impact if only through the amount of data taken up in the CPU cache - though likely also in the length of the lookup of the member or the number of jumps needed.
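
A quick way to see what a given compiler does is to compare the sizes of pointers to member functions for the three kinds of class. This is only a sketch; the class names and constants are assumptions for illustration, and the actual numbers are implementation-specific.

```cpp
#include <cstddef>

struct Single { void f() {} };
struct Left { void g() {} };
struct Multi : Single, Left { void h() {} };   // multiple inheritance
struct Virt : virtual Single { void v() {} };  // virtual inheritance

// Size of a pointer to member function for each kind of class.
constexpr std::size_t size_single = sizeof(void (Single::*)());
constexpr std::size_t size_multi = sizeof(void (Multi::*)());
constexpr std::size_t size_virtual = sizeof(void (Virt::*)());
```

With Microsoft Visual C++ in its default mode the three constants are expected to differ as described above. GCC and Clang, which follow the Itanium C++ ABI, use one representation for all classes, so the three values are equal there.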


I think there is no runtime penalty for virtual inheritance as such. Don't confuse virtual inheritance with virtual functions; they are two different things.

Virtual inheritance ensures that there is only one A sub-object in instances of D. So I don't think there would be a runtime penalty for it alone.

However, there can be cases where the location of this sub-object cannot be known at compile time, and in such cases there would be a runtime penalty for virtual inheritance. One such case is described by James in his answer.


Your question is focused mostly on calling regular functions of the virtual base, not the (far) more interesting case of virtual functions of the virtual base class (class A in your example)-- but yes, there can be a cost. Of course everything is compiler dependent.

When the compiler compiled A::foo, it assumed that "this" points to the start of where the data members for A resides in memory. At this time, the compiler might not know that class A will be a virtual base of any other class. But it happily generates the code.

Now, when the compiler compiles B, there won't really be a change, because while A is a virtual base class, it is still single inheritance. In the typical case, the compiler will lay out class B by placing class A's data members first, immediately followed by class B's data members, so a B * can be converted to an A * without any change in value, and hence no adjustments need to be made. The compiler can call A::foo using the same "this" pointer (even though it is of type B *) and there is no harm.

The situation is the same for class C: it's still single inheritance, and the typical compiler will place A's data members first, immediately followed by C's data members, so a C * can be converted to an A * without any change in value. Thus, the compiler can simply call A::foo with the same "this" pointer (even though it is of type C *) and there is no harm.

However, the situation is totally different for class D. The layout of class D will typically be class A's data members, followed by class B's data members, followed by class C's data members, followed by class D's data members.

Using the typical layout, a D * can be immediately converted to an A *, so there is no penalty for A::foo. The compiler can call the same routine it generated for A::foo without any change to "this" and everything is fine.

However, the situation changes if the compiler needs to call a member function such as C::other_member_func, even if C::other_member_func is non-virtual. The reason is that when the compiler wrote the code for C::other_member_func, it assumed that the data layout referenced by the "this" pointer is A's data members immediately followed by C's data members. But that is not true for an instance of D. The compiler may need to rewrite and create a (non-virtual) D::other_member_func, just to take care of the class instance memory layout difference.

Note that this is similar to, but different from, the situation with multiple inheritance. In multiple inheritance without virtual bases, the compiler can take care of everything by simply adding a displacement or fixup to the "this" pointer to account for where a base class is "embedded" within an instance of a derived class. But with virtual bases, sometimes a function rewrite is needed. It all depends on what data members are accessed by the (even non-virtual) member function being called.

For example, if class C defined a non-virtual member function C::some_member_func, the compiler might need to write:

  1. C::some_member_func when called from an actual instance of C (and not D), as determined at compile time (because some_member_func isn't a virtual function)
  2. C::some_member_func when the same member function is called from an actual instance of class D, as determined at compile time. (Technically this routine is D::some_member_func. Even though the definition of this member function is implicit and identical to the source code of C::some_member_func, the generated object code may be slightly different.)

if the code for C::some_member_func happens to use member variables defined in both class A and class C.
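
The case in question looks like this in code. This is a sketch with assumed member values; `call_on_standalone_C` and `call_on_C_inside_D` are hypothetical helper names. Whether a compiler emits two bodies, or one body that looks up the position of A at run time, is an implementation detail that the source cannot observe. Implementations commonly choose the single body with a run-time lookup.

```cpp
#include <cassert>

struct A { int a = 1; };
struct B : virtual A { int b = 10; };
struct C : virtual A {
    int c = 100;
    // Non-virtual, but it reads members of both A and C, so it has to work
    // wherever the A sub-object sits relative to this C sub-object.
    int some_member_func() const { return a + c; }
};
struct D : B, C { int d = 1000; };

// Called on an object whose complete type is C.
int call_on_standalone_C() {
    C obj;
    return obj.some_member_func();
}

// Called on the C sub-object of a D, where A is shared with B.
int call_on_C_inside_D() {
    D obj;
    assert(static_cast<A*>(static_cast<B*>(&obj)) ==
           static_cast<A*>(static_cast<C*>(&obj)));
    return obj.some_member_func();
}
```

Both helpers return 101: the same source-level function produces the right result for either layout.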


There has to be a cost to virtual-inheritance.

The proof is that virtually inherited classes occupy more than the sum of the parts.

Typical case:

struct A{double a;};

struct B1 : virtual A{double b1;};
struct B2 : virtual A{double b2;};

struct C : virtual B1, virtual B2{double c;}; // I think these virtuals are not strictly necessary
static_assert( sizeof(A) == sizeof(double) ); // as expected

static_assert( sizeof(B1) > sizeof(A) + sizeof(double) ); // the equality holds for non-virtual inheritance
static_assert( sizeof(B2) > sizeof(A) + sizeof(double) );  // the equality holds for non-virtual inheritance
static_assert( sizeof(C) > sizeof(A) + sizeof(double) + sizeof(double) + sizeof(double) );
static_assert( sizeof(C) > sizeof(A) + sizeof(double) + sizeof(double) + sizeof(double) + sizeof(double));

(https://godbolt.org/z/zTcfoY)

What is stored additionally? I don't exactly understand. I think it is something like a virtual table but for accessing individual members.


There is a cost in additional memory. For example, GCC 7 on x86-64 gives the following results:

#include <iostream>

class A { int a; };
class B: public A { int b; };
class C: public A { int c; };
class D: public B, public C { int d; };
class BV: virtual public A { int b; };
class CV: virtual public A { int c; };
class DV: public BV, public CV { int d; };


int main()
{
    std::cout << sizeof(A) << std::endl;
    std::cout << sizeof(B) << std::endl;
    std::cout << sizeof(C) << std::endl;
    std::cout << sizeof(D) << std::endl;
    std::cout << sizeof(BV) << std::endl;
    std::cout << sizeof(CV) << std::endl;
    std::cout << sizeof(DV) << std::endl;
    return 0;
}

This prints out:

4
8
8
20
16
16
40

As you can see, some extra bytes are added when you use virtual inheritance.


The many good answers so far explain that looking up the exact position of the virtual base class in memory incurs a performance penalty, which raises a follow-up question: "Can this penalty be reduced?" Fortunately, there is a partial solution in the form of the (not yet mentioned) final keyword. In particular, calls from class D of the original example to the innermost base A can usually be (almost) penalty-free, but in the general case only if you declare D final.

For why this is necessary, let's look at a multilevel class hierarchy:

class Base {};

class ExtA : public virtual Base {};
class ExtB : public virtual Base {};
class ExtC : public virtual Base {};

class App1 : public Base {};
class App2 : public ExtA {};
class App3 : public ExtB, public ExtC {};

class SuperApp : public App2, public App3 {};

Because our Application classes can use any combination of the Extension classes of our base class, none of those Extension classes can know at compile time where the Base subobject will be located within the object they are called with. Rather, they have to consult the virtual table at runtime to find out. This is because the various Ext and App classes can all be defined in different translation units.

But the same problem exists for the Application classes: because App2 and App3 inherit a virtual Base via the Extension class(es), they don't know at compile time where that Base subobject is located within their own objects. So each method of App2 or App3 has to consult the virtual table to find the location of the Base subobject within its own object. This is because it is legal to later combine those App classes further, as illustrated by the SuperApp class in the above hierarchy.

Also note that there is a further penalty if the Base class calls any virtual methods overridden at the Extension or Application level. That's because the virtual method will be called with this pointing to a Base object, but the override has to adjust this to the beginning of its own object by again consulting the virtual table. If an Extension or Application layer (virtual or non-virtual) method calls a virtual method defined on the Base class, that penalty is incurred twice: first for finding the Base subobject and then again for finding the real object relative to the Base subobject.
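
The double adjustment described here corresponds to a call chain like the following. This is a minimal sketch; the members `hook`, `run`, `start` and `ext_value` and the function `run_app` are hypothetical names added for illustration.

```cpp
#include <cassert>

class Base {
public:
    virtual ~Base() = default;
    virtual int hook() const { return 0; }
    // Inside Base, `this` is a Base*. The virtual call must hand the override
    // a pointer to the start of its own (sub)object.
    int run() const { return hook(); }
};

class ExtA : public virtual Base {
public:
    int hook() const override { return ext_value; }  // needs an ExtA* as `this`
private:
    int ext_value = 42;  // hypothetical field that lives in the ExtA part
};

class App2 : public ExtA {
public:
    // First step: find the Base subobject in order to call Base::run().
    // Second step: Base::run() dispatches back into ExtA::hook().
    int start() const { return run(); }
};

int run_app() {
    App2 app;
    const Base& as_base = app;             // conversion to the virtual base
    assert(as_base.run() == app.start());  // both routes end in ExtA::hook()
    return app.start();
}
```

`run_app()` returns 42, the value stored in the ExtA part, which shows that the override received a correctly adjusted `this` even though the call came from Base.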

However, if we know that a SuperApp combining several Apps won't be created, we can improve things a lot by declaring the App classes final:

class App1 final : public Base {};
class App2 final : public ExtA {};
class App3 final : public ExtB, public ExtC {};

// class SuperApp : public App2, public App3 {};   // illegal now!

Because final makes the layout immutable, methods of the Application classes don't need to go through a virtual table to find the Base subobject anymore. They just add the known constant offset to the this pointer when calling any Base method. And virtual callbacks at the Application layer can easily fix up the this pointer again by subtracting a known constant offset (or even not fix it up at all and reference the various fields from the middle of the object instead). Methods of the Base class also don't incur any penalty themselves, because inside that class everything works as normal. So in this three-level scenario with finalized classes on the outermost level, only the execution of methods on the Extension level is slower, if they need to refer to fields or methods of the Base class, or if they are virtually called from the Base.
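
A compilable version of the finalized hierarchy might look like the sketch below. The members `value_`, `base_value` and `read` and the two helper functions are assumptions added so that there is something to call. Whether a compiler actually replaces the lookup with a constant offset is an optimization decision that the program cannot observe; the code only shows that `final` fixes the complete type.

```cpp
#include <cassert>
#include <type_traits>

class Base {
public:
    int base_value() const { return value_; }
private:
    int value_ = 7;  // hypothetical field
};

class ExtA : public virtual Base {};
class ExtB : public virtual Base {};
class ExtC : public virtual Base {};

// final: no further class can derive from App2 or App3, so an App2* or App3*
// always points to a complete object whose layout is fully known.
class App2 final : public ExtA {
public:
    int read() const { return base_value(); }
};
class App3 final : public ExtB, public ExtC {
public:
    int read() const { return base_value(); }
};

static_assert(std::is_final<App2>::value, "App2 cannot be derived from");
static_assert(std::is_final<App3>::value, "App3 cannot be derived from");

int read_through_app2() {
    App2 app;
    return app.read();
}

int read_through_app3() {
    App3 app;
    // ExtB and ExtC still share a single Base subobject.
    assert(static_cast<const Base*>(static_cast<const ExtB*>(&app)) ==
           static_cast<const Base*>(static_cast<const ExtC*>(&app)));
    return app.read();
}
```

Both helpers return 7. Comparing the generated assembly of `App2::read` with and without `final` is one way to check whether a particular compiler takes advantage of it.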

The drawback of the final keyword is that it disallows all extensions. You cannot derive an App2a from App2 anymore, even if it doesn't require any of those Extensions. And declaring a non-final App2Base and then deriving final classes App2a and App2b from it would again incur penalties for all the methods in App2Base that refer to the original Base. Unfortunately, the C++ Gods didn't give us a way to just unvirtualize a base class while leaving non-virtual extensions possible. They also didn't give us a way to declare a "master" Extension class whose layout stays fixed even if other Extensions with the same virtual Base class are also added (in this case, all the non-master Extensions would refer to the Base subobject within the master Extension).

The alternative to virtual inheritance like this is usually to add all the extension stuff to the Base class. Depending on the application, that might require a lot of extra and often unused fields and/or a lot of extra virtual method calls and/or a lot of dynamic_casts, which all come with a performance penalty, too.

Also note that in modern CPUs, the penalty after a mispredicted virtual function call is much higher than the penalty after a mispredicted this pointer fixup. The former needs to throw away all results obtained on the wrong execution path and restart afresh on the right path. The latter still needs to repeat all opcodes depending directly or indirectly on this, but doesn't need to load and decode instructions again. BTW: speculative execution with unknown pointer fixups is one of the reasons why CPUs are vulnerable to Spectre/Meltdown type data leaks.
