Programming languages always evolve along with their compilers.

The Semantics of Data

The size of an empty base class, or of an empty class derived from an empty base class, is not 0. Possible reasons are a virtual pointer to the virtual function table, a pointer to a virtual base class, the need for distinct objects to have distinct addresses so that a test such as if ( &a == &b ) works, or platform-dependent alignment.
Three main interplay factors:
1. Language support overhead (vptr);
2. Compiler optimization of recognized special case;
3. Alignment constraints, machine dependent.
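The three factors can be observed with a few throwaway classes. The sketch below is only an illustration: the class names and the helper check_empty_sizes are hypothetical, and it asserts only what the language guarantees, leaving the implementation-dependent sizes to a comment.

```cpp
#include <cassert>

struct Empty {};                                  // no data members, no virtuals
struct DerivedEmpty : Empty {};                   // empty class derived from an empty base
struct VirtualDerived : virtual Empty {};         // typically gains a pointer for the virtual base
struct WithVirtualFn { virtual void f() {} };     // typically gains a vptr

// Hypothetical helper: checks only what the language guarantees.
void check_empty_sizes() {
    Empty a, b;
    assert(&a != &b);                   // distinct objects, distinct addresses
    assert(sizeof(Empty) >= 1);         // so an empty class cannot have size 0
    assert(sizeof(DerivedEmpty) >= 1);
    // sizeof(VirtualDerived) and sizeof(WithVirtualFn) are implementation
    // dependent; on common 64-bit compilers both are usually 8.
}
```

Printing the four sizeof values on two different compilers or targets is a quick way to see the overhead, optimization, and alignment factors at work.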
The empty virtual base class has become a common idiom of OO design in C++: it provides a virtual interface without defining any data.
This potential difference between compilers illustrates the evolutionary nature of the C++ Object Model. The model provides for the general case. As special cases are recognized over time, this or that heuristic is introduced to provide optimal handling. If successful, the heuristic is raised to common practice and becomes incorporated across implementations. It becomes thought of as standard, although it is not prescribed by the Standard, and over time it is likely to be thought of as part of the language.
The virtual function table is a good example of this. Another is the named return value (NRV) optimization.
A virtual base class subobject occurs only once in the derived class regardless of the number of times it occurs within the class inheritance hierarchy.
Non-static data members hold the values of individual class objects; static data members hold values of interest to the class as a whole.
The C++ object model representation for non-static data members optimizes for space and access time (and to preserve compatibility with the C language layout of the C struct) by storing the members directly within each class object.
This is also true for the inherited non-static data members of both virtual and non-virtual base classes, although the ordering of their layout is left undefined.
Static data members are maintained within the global data segment of the program and do not affect the size of individual class objects; neither do the member functions themselves.
The static data members of a template class behave slightly differently.
Each class object, then, is exactly the size necessary to contain the non-static data members of its class. This size may at times surprise you as being larger than necessary. This girth comes about in two ways:
1. Additional data members added by the compilation system to support some language functionality (primarily the virtuals);
2. Alignment requirement on the data members and data structures as a whole.
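A small check makes both points concrete. This is a minimal sketch with hypothetical names (Plain, WithStatic, check_object_size); the exact sizes depend on the platform, so only the relations between them are asserted.

```cpp
#include <cassert>

struct Plain {
    char c;     // 1 byte
    int  i;     // usually 4 bytes, usually aligned to a 4-byte boundary
};

struct WithStatic {
    char c;
    int  i;
    static int counter;              // lives outside every object
    int get() const { return i; }    // member functions add nothing either
};
int WithStatic::counter = 0;

void check_object_size() {
    // Static members and non-virtual member functions do not change the size.
    assert(sizeof(WithStatic) == sizeof(Plain));
    // Padding can make the object larger than the sum of its members.
    assert(sizeof(Plain) >= sizeof(char) + sizeof(int));
}
```

On a typical platform with a 4-byte int, sizeof(Plain) is 8 rather than 5 because of the padding after c.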

The Binding of a Data Member

The language rule back then was referred to as the “member rewriting rule” and stated generally that the body of an inline function is not evaluated until after the entire class declaration is seen.
Standard C++ refined the rewriting rule with a tuple of member scope resolution rules. The effect is still to evaluate the body of an inline member function as if it had been defined immediately following the class declaration.
Thus the binding of a data member within the body of an inline member function does not occur until after the entire class declaration is seen. This is not true of the argument list of the member function, however. Names within the argument list are still resolved in place at the point they are first encountered.
int Class::GetLength( int length ) { return length; } // length binds to the parameter, even if the class has a member named length.
Non-intuitive bindings between extern and nested type names, therefore, can still occur.
When the subsequent declaration of the nested typedef of length is encountered, the Standard C++ requires that the earlier bindings be flagged as illegal.
This aspect of the language still requires the general defensive programming style of always placing nested type declarations at the beginning of the class. In our example, placing the nested typedef defining length above any of its uses within the class corrects the non-intuitive binding.
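The defensive style can be sketched as follows. The class layout and the member names (set, get, val_) are hypothetical stand-ins for the example discussed above; the point is only that the nested typedef precedes every use of the name.

```cpp
#include <cassert>
#include <type_traits>

typedef int length;                 // namespace-scope name

class Point3d {
public:
    typedef float length;           // declared first, so every later use binds to it
    void   set(length val) { val_ = val; }
    length get() const     { return val_; }
private:
    length val_ = 0;
};

// Inside the class, length now means float everywhere, including parameter lists.
static_assert(std::is_same<decltype(Point3d().get()), float>::value,
              "Point3d::length is float, not the outer int");

void check_binding() {
    Point3d p;
    p.set(1.5f);
    assert(p.get() == 1.5f);        // would have been truncated to 1 with the int binding
}
```

Moving the typedef below set and get makes the parameter and return types bind to the outer int first, which is the case the Standard requires compilers to flag.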

Data Member Layout

The non-static data members are set down in the order of their declaration within each class object.
Any intervening static data members are ignored; they are stored in the program’s data segment, independent of individual class objects.
The Standard requires within an access section (the private, public, or protected section of a class declaration) only that the members be set down such that "later members have higher addresses within a class object".
That is, the members are not required to be set down contiguously.
Alignment constraints on the type of a succeeding member may require padding.
Additionally, the compiler may synthesize one or more additional internal data members in support of the Object Model. The vptr, for example, is one such synthesized data member that all current implementations insert within each object of a class containing one or more virtual functions.
The Standard, by phrasing the layout requirement as it does, allows the compiler the freedom to insert these internally generated members anywhere, even between those explicitly declared by the programmer.
The Standard also allows the compiler the freedom to order the data members within multiple access sections within a class in whatever order it sees fit.
The order of members across the access sections of a class object is therefore implementation dependent.
No overhead is incurred by the access section specifier or the number of access levels.
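The ordering rule can be checked with offsetof on a simple struct. The following is a sketch under the assumption of a single access section; Record and check_declaration_order are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

struct Record {
    char   tag;
    static int instances;   // ignored for layout purposes
    double value;
    int    count;
};
int Record::instances = 0;

void check_declaration_order() {
    // Record is a standard-layout type, so offsetof is well defined.
    assert(offsetof(Record, tag)   <  offsetof(Record, value));
    assert(offsetof(Record, value) <  offsetof(Record, count));
    // Members need not be contiguous: padding may follow tag.
    assert(offsetof(Record, value) >= sizeof(char));
}
```

On many platforms value sits at offset 8, leaving seven bytes of padding after tag.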

Access of a Data Member

object.dataMember = 0; What is the cost of accessing the data member?
The answer depends on how both the data member and the class are declared. The data member can be either a static or a non-static member. The object’s class can be an independent class or be derived from a single base class. Less likely, but still possible, it can be either multiply or virtually derived.

Access to Static Data Members

Static data members are literally lifted out of their class and treated as if each were declared as a global variable (but with visibility limited to the scope of the class). Note: like any global variable, it is normally initialized before main() begins executing; the compiler arranges for this.
Each member’s access permission and class association are maintained without incurring any space or runtime overhead either in the individual class objects or in the static data member itself. They are checked entirely by the compiler at compile time. (In Java, access is also checked when classes are linked.)
A single instance of each class static data member is stored within the data segment of the program. Each reference to the static member is internally translated to be a direct reference of that single extern instance.
This is the only case in the language where the access of a member through a pointer and through an object are exactly equivalent in terms of the instructions actually executed. This is because the access of a static data member through the member selection operators is a syntactic convenience only.
The member is not within the class object, and therefore the class object is not necessary for the access.
What if static data member is an inherited member of a complex inheritance hierarchy, perhaps the member of a virtual base class of a virtual base class, or some other equally complex hierarchy? It doesn’t matter. There is still only a single instance of the member within the program, and its access is direct.
What if the access of the static data member is through a function call or some other form of expression? In cfront, the expression was simply discarded. Standard C++ explicitly requires that the function be evaluated, although no use is made of its result.
Taking the address of a static data member yields an ordinary pointer of its data type, not a pointer to class member, since the static member is not contained within a class object.
const int * p = &Class::intStaticDataMember;
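The points above about static members can be exercised in one small program. This is a sketch only: Chunk, chunkSize, make, and calls are hypothetical names chosen for the illustration.

```cpp
#include <cassert>
#include <type_traits>

struct Chunk {
    static int chunkSize;
};
int Chunk::chunkSize = 0;

int calls = 0;                       // counts evaluations of make()
Chunk make() { ++calls; return Chunk(); }

// The address of a static member is a plain int*, not int Chunk::*.
static_assert(std::is_same<decltype(&Chunk::chunkSize), int*>::value,
              "ordinary pointer, not pointer to member");

void check_static_access() {
    Chunk obj;
    Chunk* ptr = &obj;
    obj.chunkSize  = 1;              // all three forms name the same single int
    ptr->chunkSize = 2;
    assert(Chunk::chunkSize == 2);

    int before = calls;
    make().chunkSize = 250;          // make() is still called; its result is unused
    assert(calls == before + 1 && Chunk::chunkSize == 250);

    const int* p = &Chunk::chunkSize;
    assert(*p == 250);
}
```

The object and pointer forms exist purely for syntactic convenience; each of them resolves to the same global int.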
The two important aspects of any name-mangling scheme are: 1. the algorithm yields unique names; 2. those unique names can be easily recast back to the original name in case the compilation system needs to communicate with the user.

Access to Non-Static Data Member

Non-static data members are stored directly within each class object and cannot be accessed except through an explicit or implicit class object. An implicit class object is present whenever the programmer directly accesses a non-static data member within a member function.
The seemingly direct access of a non-static data member is actually carried out through an implicit class object represented by the this pointer.
Access of a non-static data member requires adding the offset of the data member to the beginning address of the class object. For example, the address of origin.y is computed as &origin + ( &Point3d::y - 1 );
Notice the peculiar "subtract by one" expression applied to the pointer-to-data-member offset value. Offset values yielded by the pointer-to-data-member syntax are always bumped up by one. Doing this permits the compilation system to distinguish between a pointer to data member that is addressing the first member of a class and a pointer to data member that is addressing no member. Pointers to data members are discussed in more detail later.
The offset of each nonstatic data member is known at compile time, even if the member belongs to a base class subobject derived through a single or multiple inheritance chain. Access of a nonstatic data member, therefore, is equivalent in performance to that of a C struct member or the member of a nonderived class.
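The fixed-offset claim can be illustrated by measuring where an inherited-plus-derived member sits in two different objects. The sketch assumes a simplified Point2d/Point3d pair without virtual functions; z_offset and check_fixed_offset are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>

struct Point2d { float x, y; };
struct Point3d : Point2d {
    float z;
    // Written out, `z = v` is `this->z = v`: base address plus a fixed offset.
    void set_z(float v) { z = v; }
};

// Hypothetical helper: byte distance from the start of the object to p.z.
std::ptrdiff_t z_offset(const Point3d& p) {
    return reinterpret_cast<const char*>(&p.z) - reinterpret_cast<const char*>(&p);
}

void check_fixed_offset() {
    Point3d a, b;
    a.set_z(1.0f);
    b.set_z(2.0f);
    assert(z_offset(a) == z_offset(b));   // same offset in every Point3d
    assert(a.z == 1.0f && b.z == 2.0f);
}
```

Because the offset is the same for every object, the compiler can fold it into the generated address computation, exactly as it would for a C struct.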
Virtual inheritance introduces an additional level of indirection in the access of its members through a base class subobject.
Is the access of a data member ever significantly different when accessed through the object origin or the pointer pt? The answer is that the access is significantly different when the Point3d class is a derived class containing a virtual base class within its inheritance hierarchy and the member being accessed, such as x, is an inherited member of that virtual base class. In this case, we cannot say with any certainty which class type pt addresses (and therefore we cannot know at compile time the actual offset location of the member), so the resolution of the access must be delayed until runtime through an additional indirection. This is not the case with the object origin. Its type is that of a Point3d class, and the offset locations of even inherited virtual base class members are fixed at compile time. An aggressive compiler can therefore resolve the access of x through origin statically.

Inheritance and the Data Member

Under the C++ inheritance model, a derived class object is represented as the concatenation of its members with those of its base classes. The actual ordering of the derived and base class parts is left unspecified by the Standard. In theory, a compiler is free to place either the base or the derived part first in the derived class object.
In practice, the base class members always appear first, except in the case of a virtual base class.
In general, the handling of a virtual base class is an exception to all generalities, even of course, this one.
The layout of the data members of a class object depends on which form of inheritance is used:
1. single inheritance without virtual functions;
2. single inheritance with virtual functions;
3. multiple inheritance;
4. virtual inheritance.
In the absence of virtual functions, class declarations are equivalent to C struct declarations.

Inheritance without Polymorphism:

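Concrete inheritance adds no space or access-time overhead to the representation; its layout is akin to the C struct representation. The pitfalls are the extra padding that alignment requirements can introduce at each level of derivation, and the incorrect memberwise copy (copy constructor or assignment operator) that would result if the compiler instead packed derived class members into the padding of the base class subobject.
Copying a base class subobject would then overwrite the values of the members packed into that padding. It would be an enormous effort on the user's part to debug this, to say the least.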
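The guarantee that the padding protects can be sketched with a three-level hierarchy. The class and member names below (Concrete1 to Concrete3, bit1 to bit3) are chosen for the illustration, and the exact object sizes are ABI dependent, so the code asserts only the copy semantics.

```cpp
#include <cassert>

struct Concrete1 { int val; char bit1; };
struct Concrete2 : Concrete1 { char bit2; };
struct Concrete3 : Concrete2 { char bit3; };

void check_base_copy() {
    Concrete3 obj;
    obj.val = 1; obj.bit1 = 'a'; obj.bit2 = 'b'; obj.bit3 = 'c';

    Concrete1 src;
    src.val = 42; src.bit1 = 'z';

    Concrete1& base = obj;   // refer to the Concrete1 subobject of obj
    base = src;              // memberwise copy of the base part only

    assert(obj.val == 42 && obj.bit1 == 'z');
    assert(obj.bit2 == 'b' && obj.bit3 == 'c');   // derived members are untouched
    // Each level may add padding, so the object can be larger than the
    // sum of the members it declares.
    assert(sizeof(Concrete3) >= sizeof(Concrete1));
}
```

If bit2 were packed into the tail padding of Concrete1 and the copy moved the whole padded base, the second assertion is the one that would fail.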

Inheritance with virtual functions for polymorphism:

This flexibility, of course, is the heart of OO programming. Support for this flexibility, however, does introduce a number of space and access-time overheads:
1. Introduction of a virtual table associated with the class to hold the address of each virtual function. The size of this table in general is the number of virtual functions declared plus an additional one or two slots to support RTTI.
2. Introduction of the vptr within each class object. The vptr provides the runtime link for an object to efficiently find its associated virtual table.
3. Augmentation of the constructor to initialize the object’s vptr to the virtual table of the class. Depending on the aggressiveness of the compiler’s optimization, this may mean resetting the vptr within the derived and each base class constructor.
4. Augmentation of the destructor to reset the vptr to the associated virtual table of the class. (It is likely to have been set to address the virtual table of the derived class within the destructor of the derived class. Remember, the order of destructor calls is in reverse: derived class and then base class.) An aggressive optimizing compiler can suppress a great many of these assignments.
In general, the destructor should be declared virtual so that it is placed in the virtual function table; when the derived class destructor is invoked, the compiler automatically calls the base class destructor afterwards.
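A short sketch shows the vptr overhead and the destructor ordering. The classes are simplified stand-ins, and the global trace string is a hypothetical device for recording the call order; the size comparison holds on common compilers but is not mandated by the Standard.

```cpp
#include <cassert>
#include <string>

std::string trace;          // records the order of destructor calls

struct Plain2d { float x, y; };

struct Point2d {
    float x = 0, y = 0;
    virtual ~Point2d() { trace += "~Point2d "; }
    virtual float z() const { return 0.0f; }
};

struct Point3d : Point2d {
    float z_ = 3.0f;
    ~Point3d() override { trace += "~Point3d "; }
    float z() const override { return z_; }
};

void check_virtual_overhead() {
    trace.clear();
    // The vptr typically makes the polymorphic class larger.
    assert(sizeof(Point2d) > sizeof(Plain2d));
    Point2d* p = new Point3d;
    assert(p->z() == 3.0f);     // dispatched through the vptr
    delete p;                   // virtual destructor: derived first, then base
    assert(trace == "~Point3d ~Point2d ");
}
```

Without the virtual keyword on ~Point2d, deleting a Point3d through a Point2d pointer would be undefined behavior.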

Multiple Inheritance

Single inheritance provides a form of “natural” polymorphism regarding the conversion between base and derived types within the inheritance hierarchy.
Multiple inheritance is neither as well behaved nor as easily modeled as single inheritance. The complexity of multiple inheritance lies in the “unnatural” relationship of the derived class with its second and subsequent base class sub-objects.
The problem of multiple inheritance primarily affects conversions between the derived and second or subsequent base class object.
The assignment of the address of a multiply derived object to a pointer of its leftmost base class is the same as that for single inheritance, since both point to the same beginning address.
The assignment of the address of a second or subsequent base class, however, requires that the address be modified by the addition (or subtraction, in the case of a downcast) of the size of the intervening base class sub-objects.
The Standard does not require a specific ordering of the Point3d and Vertex base classes of Vertex3d. The original cfront implementation always placed them in the order of declaration. A Vertex3d object under cfront, therefore, consisted of the Point3d subobject (which itself consisted of a Point2d subobject), followed by the Vertex subobject and finally by the Vertex3d part. In practice, this is still how all implementations lay out the multiple base classes (with the exception of virtual inheritance).
An optimization under some compilers, however, such as the MetaWare compiler, switches the order of multiple base classes if the second (or subsequent) base class declares a virtual function and the first does not. This shuffling of the base class order saves the generation of an additional vptr within the derived class object. There is no universal agreement among implementations about the importance of this optimization, and use of this optimization is not (at least currently) widespread.
What about access of a data member of a second or subsequent base class? Is there an additional cost? No. The member's location is fixed at compile time. Hence its access is a simple offset the same as under single inheritance regardless of whether it is a pointer, reference, or object through which the member is being accessed.
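The address adjustment can be observed directly. The sketch below uses simplified versions of Point3d, Vertex, and Vertex3d without virtual functions; the members and the helper check_base_adjustment are assumptions made for the illustration.

```cpp
#include <cassert>

struct Point3d { float x = 0, y = 0, z = 0; };
struct Vertex  { Vertex* next = nullptr; };
struct Vertex3d : Point3d, Vertex { float mumble = 0; };

void check_base_adjustment() {
    Vertex3d v3d;
    Point3d* pp = &v3d;     // conversion to the first base
    Vertex*  pv = &v3d;     // conversion to the second base adjusts the address
    // Two non-empty base subobjects cannot share one address.
    assert(static_cast<void*>(pp) != static_cast<void*>(pv));
    // A downcast applies the opposite adjustment.
    assert(static_cast<Vertex3d*>(pv) == &v3d);
    // Member access through the second base is still a fixed-offset access.
    pv->next = pv;
    assert(v3d.next == pv);
}
```

The compiler inserts the adjustment itself, so the source code for both conversions looks identical.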

Virtual Inheritance

A semantic side effect of multiple inheritance is the need to support a form of shared subobject inheritance.
We need only a single base class sub-object. The language level solution is the introduction of virtual inheritance.
As complicated as the semantics of virtual inheritance may seem, its support within the compiler has proven even more complicated. In our iostream example, the implementational challenge is to find a reasonably efficient method of collapsing the two instances of an ios subobject maintained by the istream and ostream classes into a single instance maintained by the iostream class, while still preserving the polymorphic assignment between pointers (and references) of base and derived class objects.
The general implementation solution is as follows. A class containing one or more virtual base class subobjects, such as istream, is divided into two regions: an invariant region and a shared region. Data within the invariant region remains at a fixed offset from the start of the object regardless of subsequent derivations. So members within the invariant region can be accessed directly. The shared region represents the virtual base class subobjects. The location of data within the shared region fluctuates with each derivation. So members within the shared region need to be accessed indirectly. What has varied among implementations is the method of indirect access.
The general layout strategy is to first lay down the invariant region of the derived class and then build up the shared region. However, one problem remains: How is the implementation to gain access to the shared region of the class? In the original cfront implementation, a pointer to each virtual base class is inserted within each derived class object. Access of the inherited virtual base class members is achieved indirectly through the associated pointer.
There are two general solutions to the first problem. Microsoft's compiler introduced the virtual base class table. Each class object with one or more virtual base classes has a pointer to the virtual base class table inserted within it. The actual virtual base class pointers, of course, are placed within the table. Although this solution has been around for many years, I am not aware of any other compiler implementation that employs it. (It may be that Microsoft's patenting of their virtual function implementation effectively prohibits its use.)
The second solution, and the one preferred by Bjarne (at least while I was working on the Foundation project with him), is to place not the address but the offset of the virtual base class within the virtual function table.
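Whatever indirection scheme the compiler uses, the observable result is a single shared subobject. The sketch below uses hypothetical stand-ins (Ios, IStream, OStream, IOStream) rather than the real iostream classes, and contrasts them with a non-virtual version of the same shape.

```cpp
#include <cassert>

struct Ios      { int state = 0; };
struct IStream  : virtual Ios { int in_pos  = 0; };
struct OStream  : virtual Ios { int out_pos = 0; };
struct IOStream : IStream, OStream {};

// Without `virtual`, the same shape would hold two separate base subobjects.
struct IStream2  : Ios {};
struct OStream2  : Ios {};
struct IOStream2 : IStream2, OStream2 {};

void check_shared_subobject() {
    IOStream io;
    IStream* in  = &io;
    OStream* out = &io;
    // Both paths reach the single shared Ios subobject.
    assert(static_cast<Ios*>(in) == static_cast<Ios*>(out));
    in->state = 7;                  // indirect access to the shared region
    assert(out->state == 7);

    IOStream2 dup;
    Ios* a = static_cast<IStream2*>(&dup);
    Ios* b = static_cast<OStream2*>(&dup);
    assert(a != b);                 // two distinct Ios subobjects
}
```

How in->state finds the shared region (embedded pointer, virtual base class table, or offset in the virtual table) is exactly the part that differs between implementations.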

Pointer to Data Members

Pointers to data members are a somewhat arcane but useful feature of the language, particularly if you need to probe at the underlying member layout of a class. One example of such a probing might be to determine if the vptr is placed at the beginning or end of the class. A second use, presented in Section, might be to determine the ordering of access sections within the class. As I said, it's an arcane, although potentially useful, language feature.
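A minimal sketch of the syntax, assuming a simple Point3d with three float members; read and check_pointer_to_member are hypothetical helpers.

```cpp
#include <cassert>

struct Point3d {
    float x, y, z;
};

// Hypothetical helper: reads whichever coordinate the pointer to member selects.
float read(const Point3d& p, float Point3d::*member) {
    return p.*member;
}

void check_pointer_to_member() {
    Point3d origin = {1.0f, 2.0f, 3.0f};
    float Point3d::*px   = &Point3d::x;   // addresses the first member
    float Point3d::*none = nullptr;       // addresses no member
    assert(px != none);                   // the two must be distinguishable
    assert(read(origin, px) == 1.0f);
    assert(read(origin, &Point3d::z) == 3.0f);
    // Binding the pointer to an object yields an ordinary address.
    float* pz = &(origin.*(&Point3d::z));
    assert(pz == &origin.z);
}
```

A pointer to data member is effectively an offset within the class; it becomes a real address only once it is bound to an object with .* or ->*.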

From: <<Inside the C++ Object Model>>
