Programming languages always evolve along with their compilers.

The Semantics of Data

  • The size of an empty base class, or of an empty class derived from an empty base class, is not 0. Possible reasons: a virtual pointer to the virtual function table, a pointer to a virtual base class, the need for distinct objects to have distinct addresses so that a check like if ( &a == &b ) works, or platform-dependent alignment.
  • Three main interplay factors:
    1. Language support overhead (vptr);
    2. Compiler optimization of recognized special case;
    3. Alignment constraints, machine dependent.
  • The empty virtual base class has become a common idiom of OO design under C++. It provides a virtual interface without defining any data.
  • This potential difference between compilers illustrates the evolutionary nature of the C++ Object Model. The model provides for the general case. As special cases are recognized over time, this or that heuristic is introduced to provide optimal handling. If successful, the heuristic is raised to common practice and becomes incorporated across implementations. It becomes thought of as standard, although it is not prescribed by the Standard, and over time it is likely to be thought of as part of the language.
  • The virtual function table is a good example of this. Another is the named return value (NRV) optimization.
  • A virtual base class subobject occurs only once in the derived class regardless of the number of times it occurs within the class inheritance hierarchy.
  • Non-static data members hold the values of individual class objects; static data members hold values of interest to the class as a whole.
  • The C++ object model representation for non-static data members optimizes for space and access time (and to preserve compatibility with the C language layout of the C struct) by storing the members directly within each class object.
  • This is also true for the inherited non-static data members of both virtual and non-virtual base classes, although the ordering of their layout is left undefined.
  • Static data members are maintained within the global data segment of the program and do not affect the size of individual class objects. Non-virtual member functions likewise do not affect the object size.
  • The static data members of a template class behave slightly differently.
  • Each class object, then, is exactly the size necessary to contain the non-static data members of its class. This size may at times surprise you as being larger than necessary. This girth comes about in two ways:
    1. Additional data members added by the compilation system to support some language functionality (primarily the virtuals);
    2. Alignment requirement on the data members and data structures as a whole.
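
The size rules above can be checked with sizeof. The following is a minimal sketch; all type names are hypothetical, and the last assertion assumes the usual implementation in which a class with a virtual function carries a vptr.

```cpp
#include <cassert>

struct Empty {};                          // no data members
struct DerivedEmpty : Empty {};           // empty class derived from an empty base
struct Plain { int value; };
struct WithStatic { int value; static int shared; };
int WithStatic::shared = 0;               // stored once, outside every object
struct WithVirtual { int value; virtual ~WithVirtual() {} };

// Checks the size rules from the notes above.
// The last check assumes the common vptr-based implementation.
inline void check_size_rules() {
    assert(sizeof(Empty) >= 1);                   // never zero
    assert(sizeof(DerivedEmpty) >= 1);
    assert(sizeof(WithStatic) == sizeof(Plain));  // static members add nothing
    assert(sizeof(WithVirtual) > sizeof(Plain));  // room for the vptr
}
```

Calling check_size_rules() from main should pass on mainstream compilers; the exact sizes are deliberately not asserted because they depend on alignment.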

The Binding of a Data Member

  • The language rule back then was referred to as the “member rewriting rule” and stated generally that the body of an inline function is not evaluated until after the entire class declaration is seen.
  • Standard C++ refined the rewriting rule with a tuple of member scope resolution rules. The effect is still to evaluate the body of an inline member function as if it had been defined immediately following the class declaration.
  • Thus the binding of a data member within the body of an inline member function does not occur until after the entire class declaration is seen. This is not true of the argument list of the member function, however. Names within the argument list are still resolved in place at the point they are first encountered.
  • int Class::GetLength( int length ) { return length; } // length is bound to the argument, even if the class has a member named length.
  • Non-intuitive bindings between extern and nested type names, therefore, can still occur.
  • When the subsequent declaration of the nested typedef of length is encountered, the Standard C++ requires that the earlier bindings be flagged as illegal.
  • This aspect of the language still requires the general defensive programming style of always placing nested type declarations at the beginning of the class. In our example, placing the nested typedef defining length above any of its uses within the class corrects the non-intuitive binding.
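
The defensive style can be sketched as follows. The names length, Point3d, set and get are only illustrative assumptions, not the original example; the point is that the nested typedef appears before its first use, so the argument list binds to it instead of to the file-scope name.

```cpp
#include <cassert>
#include <type_traits>

typedef int length;                      // file-scope type name

class Point3d {
public:
    typedef float length;                // nested type declared before any use
    void set(length val) { _val = val; } // parameter type is Point3d::length
    length get() const { return _val; }  // so is the return type
private:
    length _val = 0;
};

// The nested name, not the file-scope one, was bound in the declarations above.
static_assert(std::is_same<decltype(Point3d().get()), float>::value,
              "get() returns the nested length type");

inline void check_binding() {
    Point3d p;
    p.set(1.5f);
    assert(p.get() == 1.5f);   // would have been truncated if length meant int
}
```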

Data Member Layout

  • The non-static data members are set down in the order of their declaration within each class object.
  • Any intervening static data members are ignored; they are stored in the program’s data segment, independent of individual class objects.
  • Within an access section (the private, public, or protected section of a class declaration), the Standard requires only that the members be set down such that "later members have higher addresses within a class object".
  • That is, the members are not required to be set down contiguously.
  • Alignment constraints on the type of a succeeding member may require padding.
  • Additionally, the compiler may synthesize one or more additional internal data members in support of the Object Model. The vptr, for example, is one such synthesized data member that all current implementations insert within each object of a class containing one or more virtual functions.
  • The Standard, by phrasing the layout requirement as it does, allows the compiler the freedom to insert these internally generated members anywhere, even between those explicitly declared by the programmer.
  • The Standard also allows the compiler the freedom to order the data members within multiple access sections within a class in whatever order it sees fit.
  • The relative order of members declared in different access sections is therefore implementation dependent.
  • No overhead is incurred by the access section specifier or the number of access levels.
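
A small sketch of these layout rules, using a hypothetical Record type. It only asserts what the notes state: later members sit at higher addresses, static members take no space in the object, and padding may appear.

```cpp
#include <cassert>
#include <cstddef>

struct Record {
    char   tag;
    static int count;    // not part of any Record object
    double weight;       // usually preceded by padding for alignment
    char   flag;
};
int Record::count = 0;

inline void check_layout() {
    // Within one access section, later members have higher addresses.
    assert(offsetof(Record, tag) < offsetof(Record, weight));
    assert(offsetof(Record, weight) < offsetof(Record, flag));
    // The object is at least as large as its non-static members;
    // any excess is padding inserted for alignment.
    assert(sizeof(Record) >= sizeof(char) + sizeof(double) + sizeof(char));
}
```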

Access of a Data Member

  • object.dataMember = 0; What is the cost of accessing the data member?
  • The answer depends both on how the data member and the class are declared. The data member can be either a static or a non-static member. The object’s class can be an independent class or be derived from a single base class. Less likely, but still possible, it can be either multiply or virtually derived.

Access to Static Data Members

  • Static data members are literally lifted out of their class and treated as if each were declared as a global variable (but with visibility limited to the scope of the class). Note: like other global variables, they are generally initialized before main() begins; the compiler arranges this.
  • Each member’s access permission and class association is maintained without incurring any space or runtime overhead either in the individual class objects or in the static data member itself. Access is checked entirely by the compiler at compile time (in Java, at link time).
  • A single instance of each class static data member is stored within the data segment of the program. Each reference to the static member is internally translated to be a direct reference of that single extern instance.
  • This is the only case in the language where the access of a member through a pointer and through an object are exactly equivalent in terms of the instructions actually executed. This is because the access of a static data member through the member selection operators is a syntactic convenience only.
  • The member is not within the class object, and therefore the class object is not necessary for the access.
  • What if static data member is an inherited member of a complex inheritance hierarchy, perhaps the member of a virtual base class of a virtual base class, or some other equally complex hierarchy? It doesn’t matter. There is still only a single instance of the member within the program, and its access is direct.
  • What if the access of the static data member is through a function call or some other form of expression? In cfront, the expression was simply discarded. Standard C++ explicitly requires that the function be evaluated, although no use is made of its result.
  • Taking the address of a static data member yields an ordinary pointer of its data type, not a pointer to class member, since the static member is not contained within a class object.
  • const int * p = &Class::intStaticDataMember;
  • The two important aspects of any name-mangling scheme are: 1. the algorithm yields unique names; 2. those unique names can be easily recast back to the original name in case the compilation system needs to communicate with the user.
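
The points about static member access can be sketched like this. The names chunkSize and make_origin are assumptions made for the example; the static_asserts show that the address of a static member is an ordinary pointer, while the address of a non-static member is a pointer to member.

```cpp
#include <cassert>
#include <type_traits>

struct Point3d {
    static int chunkSize;
    float x;
};
int Point3d::chunkSize = 250;

static int calls = 0;
inline Point3d& make_origin() { static Point3d origin; ++calls; return origin; }

static_assert(std::is_same<decltype(&Point3d::chunkSize), int*>::value,
              "address of a static member is an ordinary pointer");
static_assert(std::is_same<decltype(&Point3d::x), float Point3d::*>::value,
              "address of a non-static member is a pointer to member");

inline void check_static_access() {
    Point3d origin;
    Point3d* pt = &origin;
    origin.chunkSize = 1;          // all three forms reach the same single instance
    pt->chunkSize = 2;
    Point3d::chunkSize = 3;
    assert(&origin.chunkSize == &Point3d::chunkSize);

    int before = calls;
    make_origin().chunkSize = 4;   // the function is still evaluated
    assert(calls == before + 1);
    assert(Point3d::chunkSize == 4);
}
```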

Access to Non-Static Data Member

  • Non-static data members are stored directly within each class object and cannot be accessed except through an explicit or implicit class object. An implicit class object is present whenever the programmer directly accesses a non-static data member within a member function.
  • The seemingly direct access of a non-static data member is actually carried out through an implicit class object represented by the this pointer.
  • Access of a non-static data member requires the addition of the beginning address of the class object with the offset location of the data member. For example, the address &origin._y is equivalent to &origin + ( &Point3d::_y - 1 );
  • Notice the peculiar "subtract by one" expression applied to the pointer-to-data-member offset value. Offset values yielded by the pointer-to-data-member syntax are always bumped up by one. Doing this permits the compilation system to distinguish between a pointer to data member that is addressing the first member of a class and a pointer to data member that is addressing no member. Pointers to data members are discussed in more detail in the Pointer to Data Members section.
  • The offset of each non-static data member is known at compile time, even if the member belongs to a base class subobject derived through a single or multiple inheritance chain. Access of a non-static data member, therefore, is equivalent in performance to that of a C struct member or the member of a non-derived class.
  • Virtual inheritance introduces an additional level of indirection in the access of its members through a base class subobject.
  • Is the access of a member such as x ever significantly different when made through the object origin (origin.x) or through the pointer pt (pt->x)? The answer is that the access is significantly different when the Point3d class is a derived class containing a virtual base class within its inheritance hierarchy and the member being accessed, such as x, is an inherited member of that virtual base class. In this case, we cannot say with any certainty which class type pt addresses (and therefore we cannot know at compile time the actual offset location of the member), so the resolution of the access must be delayed until runtime through an additional indirection. This is not the case with the object origin. Its type is that of a Point3d class, and the offset location of even inherited virtual base class members is fixed at compile time. An aggressive compiler can therefore resolve the access of x through origin statically.
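
The "object address plus member offset" rule for a class without virtual bases can be sketched by hand. This is only an illustration with a hypothetical Point3d; a real compiler performs the same addition internally, and offsetof is used here in place of the bumped-by-one pointer-to-member value.

```cpp
#include <cassert>
#include <cstddef>

struct Point3d {
    float x, y, z;
    void setY(float v) { y = v; }   // really this->y = v, through the implicit this
};

// Computes &origin.y manually: start of the object plus the compile-time offset of y.
inline float* address_of_y(Point3d& origin) {
    char* base = reinterpret_cast<char*>(&origin);
    return reinterpret_cast<float*>(base + offsetof(Point3d, y));
}

inline void check_member_access() {
    Point3d origin = {0.0f, 0.0f, 0.0f};
    Point3d* pt = &origin;
    assert(address_of_y(origin) == &origin.y);   // same address either way
    assert(&pt->y == &origin.y);                 // pointer and object access agree
    origin.setY(2.0f);
    assert(pt->y == 2.0f);
}
```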

Inheritance and the Data Member

  • Under the C++ inheritance model, a derived class object is represented as the concatenation of its members with those of its base classes. The actual ordering of the derived and base class parts is left unspecified by the Standard. In theory, a compiler is free to place either the base or the derived part first in the derived class object.
  • In practice, the base class members always appear first, except in the case of a virtual base class.
  • In general, the handling of a virtual base class is an exception to all generalities, even of course, this one.
  • The layout of the data members of a class object depends on the inheritance model:
    1. single inheritance without virtual functions;
    2. single inheritance with virtual functions;
    3. multiple inheritance;
    4. virtual inheritance.
  • In the absence of virtual functions, such classes are equivalent to C struct declarations.

Inheritance without Polymorphism:

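  • Concrete inheritance adds no space or access-time overhead to the representation; its layout is akin to the C struct representation. There are two pitfalls: alignment requirements can bloat the object when one class is factored into several levels, and a member-wise copy constructor or assignment operator would be wrong if the derived members were packed into the base class padding.
  • In that case, copying or assigning a base class subobject would overwrite the values of the packed inherited members. It would be an enormous effort on the user's part to debug this, to say the least.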
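
Both pitfalls can be sketched with a flat class and the same data split over three levels. The class and member names are illustrative assumptions; the exact sizes are platform dependent, so the check only asserts that layering never makes the object smaller and that assigning to the base part leaves the derived member alone.

```cpp
#include <cassert>
#include <cstddef>

struct Concrete { int val; char c1, c2, c3; };      // everything in one class

struct Concrete1 { int val; char bit1; };           // same data, three levels
struct Concrete2 : Concrete1 { char bit2; };
struct Concrete3 : Concrete2 { char bit3; };

inline std::size_t flat_size()    { return sizeof(Concrete); }
inline std::size_t layered_size() { return sizeof(Concrete3); }

inline void check_base_assignment() {
    Concrete2 derived;
    derived.val = 1; derived.bit1 = 'a'; derived.bit2 = 'b';
    Concrete1 other;
    other.val = 2; other.bit1 = 'z';

    Concrete1& base_part = derived;
    base_part = other;               // member-wise assignment of the base subobject only

    assert(derived.val == 2 && derived.bit1 == 'z');
    assert(derived.bit2 == 'b');     // the derived member must survive
}
```

On many platforms layered_size() is larger than flat_size() because each level keeps its own padding, which is the price paid for keeping base subobject copies safe.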

Inheritance with virtual functions for polymorphism:

  • This flexibility, of course, is the heart of OO programming. Support for this flexibility, however, does introduce a number of space and access-time overheads.
  • Introduction of a virtual table associated with class to hold the address of each virtual function. The size of this table in general is the number of virtual functions declared plus an additional one or two slots to support RTTI.
  • Introduction of the vptr within each class object. The vptr provides the runtime link for an object to efficiently find its associated virtual table.
  • Augmentation of the constructor to initialize the object’s vptr to the virtual table of the class. Depending on the aggressiveness of the compiler’s optimization, this may mean resetting the vptr within the derived and each base class constructor.
  • Augmentation of the destructor to reset the vptr to the associated virtual table of the class. (It is likely to have been set to address the virtual table of the derived class within the destructor of the derived class. Remember, the order of destructor calls is in reverse: derived class and then base class.) An aggressive optimizing compiler can suppress a great many of these assignments.
  • In general, a base class destructor should be declared virtual so that it is placed in the virtual function table. When the derived class destructor is invoked, the compiler automatically calls the base class destructor afterwards.
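
A minimal sketch of the virtual dispatch and destructor ordering described above. The trace string and the class names are assumptions made for the example; the size comparison assumes the usual vptr-based implementation.

```cpp
#include <cassert>
#include <string>

static std::string trace;   // records the order of destructor calls

struct Point2d {
    virtual ~Point2d() { trace += "~Point2d "; }
    virtual const char* name() const { return "Point2d"; }
    float x = 0, y = 0;
};

struct Point3d : Point2d {
    ~Point3d() override { trace += "~Point3d "; }
    const char* name() const override { return "Point3d"; }
    float z = 0;
};

struct PlainPoint2d { float x = 0, y = 0; };   // same data, no virtual functions

// Deletes a derived object through a base pointer and reports what happened.
inline std::string destroy_through_base() {
    trace.clear();
    Point2d* p = new Point3d;
    std::string result = p->name();   // resolved through the vptr at runtime
    delete p;                         // virtual destructor: derived first, then base
    return result + ": " + trace;
}
```

The result shows the derived destructor running first and the base destructor being called automatically afterwards.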

Multiple Inheritance

  • Single inheritance provides a form of “natural” polymorphism regarding the conversion between base and derived types within the inheritance hierarchy.
  • Multiple inheritance is neither as well behaved nor as easily modeled as single inheritance. The complexity of multiple inheritance lies in the “unnatural” relationship of the derived class with its second and subsequent base class sub-objects.
  • The problem of multiple inheritance primarily affects conversions between the derived and second or subsequent base class object.
  • The assignment of the address of a multiply derived object to a pointer of its leftmost base class is the same as that for single inheritance, since both point to the same beginning address.
  • The assignment of the address of a second or subsequent base class, however, requires that the address be modified by the addition (or subtraction, in the case of a downcast) of the size of the intervening base class sub-objects.
  • The Standard does not require a specific ordering of the Point3d and Vertex base classes of Vertex3d. The original cfront implementation always placed them in the order of declaration. A Vertex3d object under cfront, therefore, consisted of the Point3d subobject (which itself consisted of a Point2d subobject), followed by the Vertex subobject and finally by the Vertex3d part. In practice, this is still how all implementations lay out the multiple base classes (with the exception of virtual inheritance).
  • An optimization under some compilers, however, such as the MetaWare compiler, switches the order of multiple base classes if the second (or subsequent) base class declares a virtual function and the first does not. This shuffling of the base class order saves the generation of an additional vptr within the derived class object. There is no universal agreement among implementations about the importance of this optimization, and use of this optimization is not (at least currently) widespread.
  • What about access of a data member of a second or subsequent base class? Is there an additional cost? No. The member's location is fixed at compile time. Hence its access is a simple offset the same as under single inheritance regardless of whether it is a pointer, reference, or object through which the member is being accessed.
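
The address adjustment for a second base class can be observed directly. This sketch uses simplified stand-ins for Point3d, Vertex and Vertex3d (their members are assumptions); the compiler inserts the offset arithmetic for the conversions shown.

```cpp
#include <cassert>

struct Point3d { float x = 0, y = 0, z = 0; };
struct Vertex  { Vertex* next = nullptr; };
struct Vertex3d : Point3d, Vertex { float mumble = 0; };

inline void check_conversions() {
    Vertex3d v3d;
    Point3d* p3d = &v3d;   // first base
    Vertex*  pv  = &v3d;   // second base: the compiler adds an offset

    // The two base subobjects both occupy storage, so they cannot share an address.
    assert(static_cast<void*>(pv) != static_cast<void*>(p3d));

    // A downcast subtracts the same offset again.
    assert(static_cast<Vertex3d*>(pv) == &v3d);

    // A null pointer must stay null, so the adjustment is guarded by a null test.
    Vertex3d* none = nullptr;
    Vertex* converted = none;
    assert(converted == nullptr);

    // Member access through the second base is still a fixed compile-time offset.
    pv->next = pv;
    assert(v3d.next == pv);
}
```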

Virtual Inheritance

  • A semantic side effect of multiple inheritance is the need to support a form of shared subobject inheritance.
  • We need only a single base class sub-object. The language level solution is the introduction of virtual inheritance.
  • As complicated as the semantics of virtual inheritance may seem, its support within the compiler has proven even more complicated. In our iostream example, the implementational challenge is to find a reasonably efficient method of collapsing the two instances of an ios subobject maintained by the istream and ostream classes into a single instance maintained by the iostream class, while still preserving the polymorphic assignment between pointers (and references) of base and derived class objects.
  • The general implementation solution is as follows. A class containing one or more virtual base class subobjects, such as istream, is divided into two regions: an invariant region and a shared region. Data within the invariant region remains at a fixed offset from the start of the object regardless of subsequent derivations. So members within the invariant region can be accessed directly. The shared region represents the virtual base class subobjects. The location of data within the shared region fluctuates with each derivation. So members within the shared region need to be accessed indirectly. What has varied among implementations is the method of indirect access.
  • The general layout strategy is to first lay down the invariant region of the derived class and then build up the shared region. However, one problem remains: How is the implementation to gain access to the shared region of the class? In the original cfront implementation, a pointer to each virtual base class is inserted within each derived class object. Access of the inherited virtual base class members is achieved indirectly through the associated pointer.
  • The cfront model has the drawback that each object carries an additional pointer for every virtual base class. There are two general solutions to this first problem. Microsoft's compiler introduced the virtual base class table. Each class object with one or more virtual base classes has a pointer to the virtual base class table inserted within it. The actual virtual base class pointers, of course, are placed within the table. Although this solution has been around for many years, I am not aware of any other compiler implementation that employs it. (It may be that Microsoft's patenting of their virtual function implementation effectively prohibits its use.)
  • The second solution, and the one preferred by Bjarne (at least while I was working on the Foundation project with him), is to place not the address but the offset of the virtual base class within the virtual function table.
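
The effect of virtual inheritance on the iostream-style hierarchy can be sketched with simplified stand-in types (all names ending in _like or _nv are hypothetical). With virtual inheritance both paths reach one shared subobject; without it there are two separate copies.

```cpp
#include <cassert>

struct ios_like { int state = 0; };

// Virtual inheritance: one shared ios_like subobject.
struct istream_like : virtual ios_like { int in = 0; };
struct ostream_like : virtual ios_like { int out = 0; };
struct iostream_like : istream_like, ostream_like {};

// Non-virtual inheritance: each path carries its own ios_like subobject.
struct istream_nv : ios_like {};
struct ostream_nv : ios_like {};
struct iostream_nv : istream_nv, ostream_nv {};

inline void check_shared_subobject() {
    iostream_like s;
    ios_like* via_in  = static_cast<istream_like*>(&s);
    ios_like* via_out = static_cast<ostream_like*>(&s);
    assert(via_in == via_out);        // a single shared instance

    s.state = 7;                      // unambiguous: only one ios_like exists
    istream_like* pin = &s;
    assert(pin->state == 7);          // reached indirectly through the shared region

    iostream_nv t;
    ios_like* a = static_cast<istream_nv*>(&t);
    ios_like* b = static_cast<ostream_nv*>(&t);
    assert(a != b);                   // two distinct instances
}
```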

Pointer to Data Members

  • Pointers to data members are a somewhat arcane but useful feature of the language, particularly if you need to probe at the underlying member layout of a class. One example of such a probing might be to determine if the vptr is placed at the beginning or end of the class. A second use, presented in Section, might be to determine the ordering of access sections within the class. As I said, it's an arcane, although potentially useful, language feature.
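
A short sketch of the pointer-to-data-member syntax, using a hypothetical Point3d. It also shows why an implementation must be able to tell a pointer to the first member apart from a pointer to no member.

```cpp
#include <cassert>

struct Point3d { float x, y, z; };

inline void check_pointer_to_member() {
    float Point3d::* none   = nullptr;        // addresses no member
    float Point3d::* first  = &Point3d::x;    // addresses the first member
    float Point3d::* second = &Point3d::y;

    assert(none != first);                    // the two must be distinguishable
    assert(first != second);

    Point3d origin = {1.0f, 2.0f, 3.0f};
    Point3d* pt = &origin;
    origin.*second = 5.0f;                    // bind the member pointer to an object
    assert(pt->*second == 5.0f);              // or to a pointer to an object
    assert(origin.*first == 1.0f);
}
```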

From: <<Inside the C++ Object Model>>
