Alternative implementations of the virtual mechanism?

C ++ supports dynamic binding through a virtual mechanism. But since I understand that the virtual mechanism is a detail of the compiler implementation, and the standard simply defines the behavior of what should happen in certain scenarios. Most compilers implement a virtual mechanism through a virtual table and virtual pointer. And yes, I know how it works. So my question is not about implementation details of virtual pointers and tables. My questions:

  • Are there any compilers that implement Virtual Mechanism in any way other than a virtual pointer and a virtual table engine? As far as I saw most (read g ++, Microsoft visual studio) implement it through a virtual table, a pointer mechanism. Are there practically any other compiler implementations?
  • sizeof any class with a virtual function will be the size of the pointer (vptr inside this ) on this compiler. Therefore, if the virtual mechanism ptr and tbl itself is a compiler implementation, is this expression that I made above is always true?
+47
c ++ dynamic virtual vtable
Dec 04 '10 at 5:02
source share
11 answers

It is not true that vtable pointers in objects are always the most efficient. My compiler for another language, used to use pointers in objects for similar reasons, but no longer does: instead, it uses a separate data structure that maps the address of the object to the required metadata: on my system, this happens as form information for use by the garbage collector .

This implementation requires a bit more storage for one simple object, more efficient for complex objects with many databases, and more efficient for arrays, since only one entry in the mapping table is required for all objects in the array. My specific implementation can also find metadata with a pointer to any point inside the object.

The actual search is very fast, and the storage requirements are very modest, because I use the best data structure on the planet: Judy arrays.

I also don't know the C ++ compiler using anything but vtable pointers, but this is not the only way. In fact, the initialization semantics for classes with bases makes any implementation messy. This is because the full type must see around around the object. As a consequence of this semantics, complex mixin objects result in massive sets of generated vtables, large objects, and slow object initialization. This is probably not a consequence of vtable technology, since it requires a split compliance with the requirement that the type of execution time of a subobject is correct at all times. In fact, there is no good reason for this during construction, since the constructors are not methods and cannot reasonably use virtual dispatching: this is not so clear to me for destruction, since destructors are real methods.

+19
Dec 07 '10 at 20:57
source share

As far as I know, in all C ++ implementations, the vtable pointer is used, although it would be quite easy (and perhaps not as bad as you might think that this cache) to save a small type index in the object (1- 2 B), and then get the vtable and type information with a little table scan.

Another interesting approach might be BIBOP (http://foldoc.org/BIBOP) - a large package of pages, although it will have problems for C ++. Idea: place objects of the same type on the page. Get a pointer to a descriptor like / vtable at the top of the page, simply by canceling the less significant bits of the object pointer. (Doesn't work well for objects on the stack, of course!)

Another other approach is to encode certain type tags / indices in the object pointers themselves. For example, if by construction all objects are aligned by 16 bytes, you can use the 4 least significant bits to place a tag of type 4-bit type. (Actually not enough.) Or (especially for embedded systems), if you guaranteed unused more significant bits in the addresses, you can add more tag bits there and restore them with a shift and a mask.

Although both of these schemes are interesting (and sometimes used) for other language implementations, they are problematic for C ++. Certain specific C ++ semantics, such as overriding virtual functions of a base class, are called when constructing and destroying objects (a base class), leading you to a model in which there is a certain state in the object that you change when you enter the ctors / dtors base class.

You can find my old tutorial on implementing the Microsoft C ++ object model. http://www.openrce.org/articles/files/jangrayhood.pdf

Happy hack!

+7
Dec 12 '10 at 4:42
source share
  • I don't think there are modern compilers with an approach other than vptr / vtable. In fact, it would be difficult to understand something else that is not just ineffective.

    However, there is still a fairly large room for compromise as part of this approach. Perhaps, especially with regard to how virtual inheritance is handled. Therefore, it makes sense to make this implementation specific.

    If you are interested in such material, I highly recommend reading Inside the C ++ Object Model .

  • sizeof class is compiler dependent. If you want to port code, make no assumptions.

+5
Dec 04 '10 at
source share

Are there any compilers that implement Virtual Mechanism in any other way than the virtual pointer and the virtual table engine? As far as I saw most (read g ++, Microsoft visual studio) implement it through a virtual table, a pointer mechanism. Are there practically any other compiler implementations?

All current compilers that I know of use the vtable mechanism.

This is an optimization that is possible because C ++ is checked statically.

In some more dynamic languages, instead, there is a dynamic search through a chain of base classes, a search for the implementation of a member function, which is called virtually, starting with the most derived object class. For example, how it works in the original Smalltalk. And the C ++ standard describes the effect of a virtual call, as if such a search were used.

At Borland / Turbo Pascal in 1990, this dynamic search was used to search for Windows API window message handlers. And I think maybe the same thing in Borland C ++. This was in addition to the regular vtable mechanism used exclusively for message handlers.

If it was used in Borland / Turbo C ++ - I can’t remember - then it supported language extensions that allowed you to associate a message identifier with message handler functions.

The size of any class with a virtual function will be the size of the pointer (vptr inside this) in this compiler. Therefore, given that the virtual mechanism of ptr and tbl itself is an implementation of the compiler, will this statement I made above is always true?

Formally, no (even with the assumption of the vtable mechanism), it depends on the compiler. Since the standard does not require a vtable mechanism, it does not say anything about placing a vtable pointer in every object. And other rules allow the compiler to freely add padding, unused bytes at the end.

But in practice it is possible .; -)

However, this is not something you should rely on, or what you need to rely on. But in a different direction, this may be required, for example, if you define an ABI. Then any compiler that does not do this simply does not meet your requirements.

Cheers and hth.,

+5
Dec 07 '10 at
source share

Trying to present an alternative circuit, I came up with the following, according to Yttril's answer. As far as I know, the compiler does not use it!

Given a sufficiently large virtual address space and flexible routines for allocating OS memory, it would be possible for new allocate objects of different types in fixed, non-overlapping address ranges. Then the type of the object could be quickly deduced from its address using the shift operation to the right and the result that was used to index the vtables table, thereby saving 1 vtable pointer for each object.

At first glance, this scheme may run into problems with objects related to stacks, but this can be handled cleanly:

  • For each object allocated for the stack, the compiler adds code that adds an entry to the global array of pairs (address range, type) when creating the object and deletes the entry when it is destroyed.
  • The address range containing the stack will be mapped to one vtable containing a large number of thunks that read the this pointer, scan the array to find the appropriate type (vptr) for the object at that address, and call the corresponding method in the vtable that it points to. (Ie the 42nd thunk will call the 42nd method in vtable - if most of the virtual functions used in any class are n , then at least n thunks are required.)

This scheme obviously carries non-trivial overhead (at least O (log n) for search) for invoking virtual methods on stack-based objects. In the absence of arrays or composition (containment within another object) of objects based on the stack, a simpler and faster approach can be used in which vptr is pushed onto the stack immediately in front of the object (note that it is not considered to be part of the object and does not affect its size, measured sizeof ). In this case, thunks simply subtracts sizeof (vptr) from this to find the correct vptr to use and forward as before.

+4
Dec 12 2018-10-12T00:
source share
  • I have never heard or seen a compiler that uses any alternative implementation. The reason vtables is so popular is because it is not only the most efficient implementation, but also the simplest design and the most obvious implementation.

  • Pretty much any compiler you want to use is almost certainly true. However, this is not guaranteed and not always true - you cannot depend on it, although this is almost always the case. Your favorite compiler can also change its alignment, increasing its size, for funsies without telling you. From memory, he can also insert any debugging information and everything that she likes.

+3
Dec 04 '10 at 12:05
source share

Are there any compilers that implement Virtual Mechanism in any other way than the virtual pointer and the virtual table engine? As far as I saw most (read g ++, Microsoft visual studio) implement it through a virtual table, a pointer mechanism. Are there any other compiler implementations in general?

No, what I know about C ++ compilers, although you might find it interesting to read about Binary Tree Dispatch. If you are interested in using the expectations of virtual dispatch tables in any way, you should know that compilers can - when types are known at compile time - sometimes allow calls to virtual functions at compile time, so they cannot access the table.

The size of any class with a virtual function will be the size of the pointer (vptr inside this) in this compiler. Therefore, given that the virtual mechanism of ptr and tbl itself is an implementation of the compiler, will this statement I made above is always true?

Assuming that base classes with their own virtual members are not virtual base classes, this is likely to be true. Alternatives can be envisioned, such as analyzing the entire program, showing only one member in the hierarchical class hierarchy and switching to sending compilation time. If runtime scheduling is required, it is hard to imagine why any compiler would introduce further indirection. However, the Standard does not intentionally define these things in such a way that implementations may change or vary in the future.

+3
Dec 07 '10 at 7:44
source share

C ++ / CLI deviates from both assumptions. If you define a ref class, it does not compile into machine code at all; instead, the compiler compiles it into managed .NET code. In an intermediate language, classes are a built-in function, and a set of virtual methods is defined in the metadata, and not in the method table.

The specific strategy for implementing the layout of the object and sending depends on the virtual machine. In Mono, an object containing only one virtual method does not have the size of a single pointer, but needs two pointers in a MonoObject struct ; second to synchronize the object. Since this is implementation-defined, and also not very useful to know, sizeof is not supported for ref classes in C ++ / CLI.

+3
Dec 09 '10 at 19:47
source share

IIRC Eiffel takes a different approach, and all method overrides are ultimately combined and compiled at the same address with the prolog where the object type is checked (therefore, each object must have a type identifier, but this is not a pointer to VMT). This for C ++ will require, of course, that the final function is created during the connection. However, I do not know the C ++ compiler that uses this approach.

+3
Dec 12 2018-10-12T00:
source share

Tony D's answer correctly indicates that compilers are allowed to use analysis of the entire program to replace a virtual function call with a static call with an implementation of a unique possible function; or compile obj->method() into the equivalent

 if (auto frobj = dynamic_cast<FrequentlyOccurringType>(obj)) { frobj->FrequentlyOccurringType::method(); // static dispatch on hot path } else { obj->method(); // vtable dispatch on cold path } 

Karel Driesen and Urs Kholzl (Karel Driesen) and Urs Kholzl (Karsl Driesen) and Urs Kholzl (Karsl Driesen) and Urs Kholzl (Karsl Driesen) and Urs Kryls Dzryl functions in C ++. " (A PDF file is available for free if you are Google for it.) Unfortunately, they only compare vtable sending and excellent static sending; they did not compare it with sending a binary tree.

They noted that there are actually two types of vtables when you talk about languages ​​(like C ++) that support multiple inheritance. In multiple inheritance, when you call a virtual method that is inherited from the second base class, you need to "fix" the object pointer so that it points to an instance of the second base class. This commit offset can be saved as data in the vtable, or it can be saved as code in thunk. (See the document for details.)

I believe that all decent compilers use thunks these days, but 10 or 20 years have passed for this market penetration to reach 100%.

0
Apr 25 '13 at 22:22
source share

First, the Borland extension for C ++, virtual dynamic sending tables (DDVT), was mentioned, and you can read something about it in a file called DDISPATC.ZIP . Borland Pascal had virtual and dynamic methods, and Delphi introduced yet another message syntax , similar to dynamic, but for messages. At the moment, I'm not sure that Borland C ++ had the same functions. There was no multiple inheritance in Pascal or Delphi, so Borland C ++ DDVT may be different from Pascal or Delphi.

Secondly, in the 1990s and a little earlier they experimented with various object models, and Borland was not the most advanced. I personally believe that the closure of IBM SOMobjects was detrimental to the world that we all still suffer from. Before closing SOM, there were experiments with Direct-to-SOM C ++ compilers. Therefore, instead of the C ++ call method, the SOM method is used. It looks a lot like C ++ vtable, with a few exceptions. First, to prevent the problem of a fragile base class, programs do not use offsets inside the vtable, because they do not know this offset. It may change if the base class introduces new methods. Instead, callers invoke a thunk created at runtime that has this knowledge in its assembler code. And there is one more difference. In C ++, when multiple inheritance is used, an object can contain multiple VMT IIRCs. Unlike C ++, each SOM object has only one VMT, so the send code must be different from "call dword ptr [VMT + offset]".

There is a document related to SOM, Compatibility with the binary version of Release-to-Release in SOM . You can find a comparison of SOM with other projects that I know little, such as Delta / C ++ and Sun OBI . They solve a subset of the problems that SOM solves, and thus they also have a slightly modified activation code.

I recently found Visual Age C ++ v3.5 for a fragment of the Windows compiler enough to make everything work and actually touch it. Most users are unlikely to get the OS / 2 VM to just play with DTS C ++, but having a Windows compiler is a different matter. VAC v3.5 is the first and latest version to support the Direct-to-SOM C ++ feature. VAC v3.6.5 and v4.0 are not suitable.

  • Download VAC 3.5 fixpak 9 from IBM FTP. This fixpak contains many files, so you don’t even need a full compiler (I have distribution 3.5.7, but fixpak 9 was big enough to run some tests).
  • Unzip to e. C: \ house \ octagram \ DTS
  • Run the command line and run the following commands there
  • Run: set SOMBASE = C: \ home \ OCTAGRAM \ DTS \ ibmcppw
  • Run: C: \ home \ OCTAGRAM \ DTS \ ibmcppw \ bin \ SOMENV.BAT
  • : cd C:\home\OCTAGRAM\DTS\ibmcppw\samples\compiler\dts
  • : nmake clean
  • : nmake
  • hhmain.exe dll , - ; , "set PATH =% PATH%; C:\home\OCTAGRAM\DTS\ibmcppw\samples\compiler\dts\xhmain\dtsdll" , dll hhmain.
  • : hhmain.exe

:

 Local anInfo->x = 5 Local anInfo->_get_x() = 5 Local anInfo->y = A Local anInfo->_get_y() = B {An instance of class info at address 0092E318 } 
0
07 . '14 15:38
source share



All Articles