The page “Notes on Multiple Inheritance in GCC C++ Compiler v4.0.1” is now offline, and http://web.archive.org did not archive it. I found a copy of the text at tinydrblog, which is archived at the web archive.
Below is the full text of the original Notes, published online as part of “Doctoral Programming Language Seminar: GCC Internals” (fall 2005) by graduate student Morgan Deters “in the Distributed Object Computing Laboratory in the Computer Science department at Washington University in St Louis.”
The text below is by Morgan Deters and is NOT CC-licensed.
Morgan Deters’ webpages:
- Archived homepage: http://web.archive.org/web/20060908050947/http://www.cse.wustl.edu/~mdeters/
- His biography: http://web.archive.org/web/20060910122623/http://www.cse.wustl.edu/~mdeters/bio/
- Google Scholar profile: https://scholar.google.com/citations?user=DsD8SDYAAAAJ&hl=en&oi=sra
- More recent webpage https://cs.nyu.edu/~mdeters/
- Obituary from 2015: http://www.sent-trib.com/obituaries/dr-morgan-g-deters/article_70a9b22a-a307-11e4-ba08-476415c7fb1c.html “Dr. Morgan G. Deters, PhD, 35, formerly of Bowling Green and more recently of Brooklyn, NY, died Saturday, January 17, 2015 in Tobago, Trinidad.”
- http://cvc4.cs.stanford.edu/web/in-memoriam-morgan-deters/
PART1:
The Basics: Single Inheritance
As we discussed in class, single inheritance leads to an object layout with base class data laid out before derived class data. So if classes A and B are defined thusly:

    class A {
    public:
      int a;
    };

    class B : public A {
    public:
      int b;
    };

then objects of type B are laid out like this (where “b” is a pointer to such an object):

    b --> +-----------+
          |     a     |
          +-----------+
          |     b     |
          +-----------+
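The layout above can be checked from inside a program. The following is only a sketch, assuming a typical GCC or Clang build; the helper names `byte_offset` and `base_comes_first` are made up for this illustration, and the C++ standard itself does not promise this ordering.

```cpp
#include <cassert>
#include <cstddef>

class A { public: int a; };
class B : public A { public: int b; };

// Hypothetical helper: byte distance from the start of `base` to `member`.
inline std::ptrdiff_t byte_offset(const void* base, const void* member) {
    return static_cast<const char*>(member) - static_cast<const char*>(base);
}

// Returns true when the A part sits at the very start of a B object and
// the inherited member `a` comes before the derived member `b`.
// Assumption: a common GCC/Clang layout, not a language guarantee.
inline bool base_comes_first() {
    B obj{};
    A* as_base = &obj;  // implicit derived-to-base conversion
    return byte_offset(&obj, as_base) == 0
        && byte_offset(&obj, &obj.a) < byte_offset(&obj, &obj.b);
}
```

Calling `base_comes_first()` from `main` is expected to return true on such a build, which is why a pointer-to-B can be used as a pointer-to-A without any address change here.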
If you have virtual methods:

    class A {
    public:
      int a;
      virtual void v();
    };

    class B : public A {
    public:
      int b;
    };

then you’ll have a vtable pointer as well:

                               +-----------------------+
                               |     0 (top_offset)    |
                               +-----------------------+
    b --> +----------+         | ptr to typeinfo for B |
          |  vtable  |-------> +-----------------------+
          +----------+         |         A::v()        |
          |     a    |         +-----------------------+
          +----------+
          |     b    |
          +----------+

that is, top_offset and the typeinfo pointer live above the location to which the vtable pointer points.

Simple Multiple Inheritance
Now consider multiple inheritance:

    class A {
    public:
      int a;
      virtual void v();
    };

    class B {
    public:
      int b;
      virtual void w();
    };

    class C : public A, public B {
    public:
      int c;
    };

In this case, objects of type C are laid out like this:

                               +-----------------------+
                               |     0 (top_offset)    |
                               +-----------------------+
    c --> +----------+         | ptr to typeinfo for C |
          |  vtable  |-------> +-----------------------+
          +----------+         |         A::v()        |
          |     a    |         +-----------------------+
          +----------+         |    -8 (top_offset)    |
          |  vtable  |---+     +-----------------------+
          +----------+   |     | ptr to typeinfo for C |
          |     b    |   +---> +-----------------------+
          +----------+         |         B::w()        |
          |     c    |         +-----------------------+
          +----------+

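The second vtable pointer in this layout shows up as a visible address change when a pointer-to-C is converted to a pointer-to-B. The sketch below is an illustration only: the empty function bodies are added so the classes link, and the helper names `byte_distance`, `b_offset_in_c` and `top_offset_round_trip` are assumptions of this example. The exact offset depends on the target (8 on the 32-bit layout drawn above, typically 16 on 64-bit GCC/Clang).

```cpp
#include <cassert>
#include <cstddef>

class A { public: int a; virtual void v() {} };
class B { public: int b; virtual void w() {} };
class C : public A, public B { public: int c; };

// Hypothetical helper: byte distance between two addresses in one object.
inline std::ptrdiff_t byte_distance(const void* from, const void* to) {
    return static_cast<const char*>(to) - static_cast<const char*>(from);
}

// Offset of the B subobject inside a C. The compiler adds this offset
// whenever a C* is converted to a B*.
inline std::ptrdiff_t b_offset_in_c() {
    C obj;
    B* as_b = &obj;  // pointer is adjusted here
    return byte_distance(&obj, as_b);
}

// dynamic_cast<void*> returns the address of the most-derived object.
// Under the ABI GCC uses, the top_offset slot is what makes this possible.
inline bool top_offset_round_trip() {
    C obj;
    B* as_b = &obj;
    return dynamic_cast<void*>(as_b) == static_cast<void*>(&obj);
}
```

On a typical build `b_offset_in_c()` is positive, while `top_offset_round_trip()` returns true because the cast undoes the adjustment.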
…but why? Why two vtables in one? Well, think about type substitution. If I have a pointer-to-C, I can pass it to a function that expects a pointer-to-A or to a function that expects a pointer-to-B. If a function expects a pointer-to-A and I want to pass it the value of my variable c (of type pointer-to-C), I’m already set. Calls to A::v() can be made through the (first) vtable, and the called function can access the member a through the pointer I pass in the same way as it can through any pointer-to-A.

However, if I pass the value of my pointer variable c to a function that expects a pointer-to-B, we also need a subobject of type B in our C to refer it to. This is why we have the second vtable pointer. We can pass the pointer value (c + 8 bytes) to the function that expects a pointer-to-B, and it’s all set: it can make calls to B::w() through the (second) vtable pointer, and access the member b through the pointer we pass in the same way as it can through any pointer-to-B.

Note that this “pointer-correction” needs to occur for called methods too. Class C inherits B::w() in this case. When w() is called through a pointer-to-C, the pointer (which becomes the this pointer inside of w()) needs to be adjusted. This is often called this pointer adjustment.

In some cases, the compiler will generate a thunk to fix up the address. Consider the same code as above but this time C overrides B’s member function w():

    class A {
    public:
      int a;
      virtual void v();
    };

    class B {
    public:
      int b;
      virtual void w();
    };

    class C : public A, public B {
    public:
      int c;
      void w();
    };

C’s object layout and vtable now look like this:

                               +-----------------------+
                               |     0 (top_offset)    |
                               +-----------------------+
    c --> +----------+         | ptr to typeinfo for C |
          |  vtable  |-------> +-----------------------+
          +----------+         |         A::v()        |
          |     a    |         +-----------------------+
          +----------+         |         C::w()        |
          |  vtable  |---+     +-----------------------+
          +----------+   |     |    -8 (top_offset)    |
          |     b    |   |     +-----------------------+
          +----------+   |     | ptr to typeinfo for C |
          |     c    |   +---> +-----------------------+
          +----------+         |    thunk to C::w()    |
                               +-----------------------+

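With C overriding w(), a call made through a pointer-to-B still has to reach C::w() with this pointing at the whole C object rather than at its B part. The following sketch makes that observable; the member `seen_this` and the function `this_is_adjusted` are hypothetical additions for the demonstration and are not part of the original classes. An optimizer may resolve the call statically, so the code shows the effect of the adjustment, not the mechanism.

```cpp
#include <cassert>

class A { public: int a; virtual void v() {} };
class B { public: int b; virtual void w() {} };

class C : public A, public B {
public:
    int c;
    const void* seen_this = nullptr;         // hypothetical: records `this`
    void w() override { seen_this = this; }  // overrides B::w()
};

// Calls w() through a pointer-to-B and reports whether C::w() saw the
// address of the complete C object, even though the caller only held
// the (different) address of the B subobject.
inline bool this_is_adjusted() {
    C obj;
    B* as_b = &obj;
    as_b->w();  // virtual call made through the B part
    return obj.seen_this == static_cast<const void*>(&obj)
        && static_cast<const void*>(as_b) != static_cast<const void*>(&obj);
}
```

`this_is_adjusted()` returning true means the address was moved back from the B subobject to the start of the C object somewhere between the call site and the body of C::w().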
Now, when w() is called on an instance of C through a pointer-to-B, the thunk is called. What does the thunk do? Let’s disassemble it (here, with gdb):

    0x0804860c <_ZThn8_N1C1wEv+0>:  addl   $0xfffffff8,0x4(%esp)
    0x08048611 <_ZThn8_N1C1wEv+5>:  jmp    0x804853c <_ZN1C1wEv>

So it merely adjusts the this pointer and jumps to C::w(). All is well.

But doesn’t the above mean that B’s vtable always points to this C::w() thunk? I mean, if we have a pointer-to-B that is legitimately a B (not a C), we don’t want to invoke the thunk, right?

Right. The above embedded vtable for B in C is special to the B-in-C case. B’s regular vtable is normal and points to B::w() directly.

The Diamond: Multiple Copies of Base Classes (non-virtual inheritance)
Okay. Now to tackle the really hard stuff. Recall the usual problem of multiple copies of base classes when forming an inheritance diamond:

    class A {
    public:
      int a;
      virtual void v();
    };

    class B : public A {
    public:
      int b;
      virtual void w();
    };

    class C : public A {
    public:
      int c;
      virtual void x();
    };

    class D : public B, public C {
    public:
      int d;
      virtual void y();
    };

Note that D inherits from both B and C, and B and C both inherit from A. This means that D has two copies of A in it. The object layout and vtable embedding is what we would expect from the previous sections:

                               +-----------------------+
                               |     0 (top_offset)    |
                               +-----------------------+
    d --> +----------+         | ptr to typeinfo for D |
          |  vtable  |-------> +-----------------------+
          +----------+         |         A::v()        |
          |     a    |         +-----------------------+
          +----------+         |         B::w()        |
          |     b    |         +-----------------------+
          +----------+         |         D::y()        |
          |  vtable  |---+     +-----------------------+
          +----------+   |     |   -12 (top_offset)    |
          |     a    |   |     +-----------------------+
          +----------+   |     | ptr to typeinfo for D |
          |     c    |   +---> +-----------------------+
          +----------+         |         A::v()        |
          |     d    |         +-----------------------+
          +----------+         |         C::x()        |
                               +-----------------------+

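The two copies of A can be observed directly. This is a small sketch under the same class definitions, with empty bodies added so it links; the function name `has_two_copies_of_a` is invented for the example.

```cpp
#include <cassert>

class A { public: int a; virtual void v() {} };
class B : public A { public: int b; virtual void w() {} };
class C : public A { public: int c; virtual void x() {} };
class D : public B, public C { public: int d; virtual void y() {} };

// Returns true when the two A subobjects of a D live at different
// addresses and hold independent values of the member `a`.
inline bool has_two_copies_of_a() {
    D obj;
    // A direct D* -> A* conversion would be ambiguous, so each path
    // is chosen explicitly by going through B or through C first.
    A* a_via_b = static_cast<B*>(&obj);
    A* a_via_c = static_cast<C*>(&obj);
    a_via_b->a = 1;
    a_via_c->a = 2;  // does not overwrite the copy reached through B
    return a_via_b != a_via_c && a_via_b->a == 1 && a_via_c->a == 2;
}
```

Writing `A* p = &obj;` directly is rejected by the compiler for exactly this reason: it cannot tell which of the two A subobjects is meant.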
Of course, we expect A’s data (the member a) to exist twice in D’s object layout (and it is), and we expect A’s virtual member functions to be represented twice in the vtable (and A::v() is indeed there). Okay, nothing new here.

The Diamond: Single Copies of Virtual Bases
But what if we apply virtual inheritance? C++ virtual inheritance allows us to specify a diamond hierarchy but be guaranteed only one copy of virtually inherited bases. So let’s write our code this way:

    class A {
    public:
      int a;
      virtual void v();
    };

    class B : public virtual A {
    public:
      int b;
      virtual void w();
    };

    class C : public virtual A {
    public:
      int c;
      virtual void x();
    };

    class D : public B, public C {
    public:
      int d;
      virtual void y();
    };

All of a sudden things get a lot more complicated. If we can only have one copy of A in our representation of D, then we can no longer get away with our “trick” of embedding a C in a D (and embedding a vtable for the C part of D in D’s vtable). But how can we handle the usual type substitution if we can’t do this?

Let’s try to diagram the layout:

                                       +-----------------------+
                                       |   20 (vbase_offset)   |
                                       +-----------------------+
                                       |     0 (top_offset)    |
                                       +-----------------------+
                                       | ptr to typeinfo for D |
                          +----------> +-----------------------+
    d --> +----------+    |            |         B::w()        |
          |  vtable  |----+            +-----------------------+
          +----------+                 |         D::y()        |
          |     b    |                 +-----------------------+
          +----------+                 |   12 (vbase_offset)   |
          |  vtable  |---------+       +-----------------------+
          +----------+         |       |    -8 (top_offset)    |
          |     c    |         |       +-----------------------+
          +----------+         |       | ptr to typeinfo for D |
          |     d    |         +-----> +-----------------------+
          +----------+                 |         C::x()        |
          |  vtable  |----+            +-----------------------+
          +----------+    |            |    0 (vbase_offset)   |
          |     a    |    |            +-----------------------+
          +----------+    |            |   -20 (top_offset)    |
                          |            +-----------------------+
                          |            | ptr to typeinfo for D |
                          +----------> +-----------------------+
                                       |         A::v()        |
                                       +-----------------------+

Okay. So you see that A is now embedded in D in essentially the same way that other bases are. But it’s embedded in D rather than in its directly-derived classes.
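One consequence of embedding A in D rather than in B or C is that the distance from a B subobject to its A is no longer fixed: it depends on the complete object, which is why a vbase_offset has to be stored at all. The sketch below illustrates this under the assumption of a GCC or Clang layout like the one drawn above; the helper names `a_offset_from`, `single_shared_a` and `offset_depends_on_complete_type` are made up for this example.

```cpp
#include <cassert>
#include <cstddef>

class A { public: int a; virtual void v() {} };
class B : public virtual A { public: int b; virtual void w() {} };
class C : public virtual A { public: int c; virtual void x() {} };
class D : public B, public C { public: int d; virtual void y() {} };

// Distance in bytes from a B subobject to the A subobject it shares.
// Conceptually, this conversion looks up the vbase_offset at run time.
inline std::ptrdiff_t a_offset_from(B* b_part) {
    A* a_part = b_part;
    return reinterpret_cast<char*>(a_part) - reinterpret_cast<char*>(b_part);
}

// Both paths through the diamond reach the one and only A subobject.
inline bool single_shared_a() {
    D obj;
    A* via_b = static_cast<B*>(&obj);
    A* via_c = static_cast<C*>(&obj);
    return via_b == via_c;
}

// A standalone B and the B inside a D place A at different distances
// (8 versus 20 bytes in the 32-bit layout diagrammed in the text).
inline bool offset_depends_on_complete_type() {
    B standalone;
    D derived;
    return a_offset_from(&standalone) != a_offset_from(&derived);
}
```

`single_shared_a()` holds by the language rules for virtual bases, while the differing offsets are a property of the layout and could in principle vary with another ABI.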