Tuesday, June 5, 2007

C++ : Multiple Access Specifiers in a Class

On May 27th, Ramshankar asked what seemed like a pretty innocuous question in one of the C++ communities on Orkut:

class TUid
{
public:
    IMPORT_C TInt operator==(const TUid& aUid) const;
    static inline TUid Null();
public:
    TInt32 iUid;
};


What is the purpose of declaring the "public" section again? Is it to allow:

TUid myType = { 0x01232423 };
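As a quick check of the question above, here is a minimal sketch using a hypothetical stand-in for TUid (the Symbian-specific TInt32 and IMPORT_C are replaced with standard C++, and Uid, value and kExample are invented names). Repeating public: does not by itself affect brace initialization. The class stays an aggregate as long as it has no private or protected data members, no user-declared constructors, no base classes and no virtual functions.

```cpp
#include <cstdint>

// Hypothetical stand-in for the Symbian TUid class from the question.
class Uid {
public:
    constexpr bool operator==(const Uid& other) const { return value == other.value; }
    static constexpr Uid Null() { return Uid{0}; }
public:  // redundant second label, as in the original
    std::int32_t value;
};

// Brace (aggregate) initialization works exactly as it would with one label.
constexpr Uid kExample = {0x01232423};
static_assert(kExample.value == 0x01232423, "aggregate initialization set the member");
static_assert(Uid::Null() == Uid{0}, "Null() yields a zero UID");
```

So the second label looks like a stylistic choice (functions in one section, data in another) rather than something the initialization depends on.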

Multiple public/protected/private sections are very much allowed in C++. In fact, they are seen in MFC wizard-generated code. But, the real problem lay in not whether it was allowed, but why has this been allowed? The C++ standard states:

12. Nonstatic data members of a (non-union) class declared without an intervening access-specifier are allocated so that later members have higher addresses within a class object. The order of allocation of nonstatic data members separated by an access-specifier is unspecified (_class.access.spec_).

So, to find an accurate (reliable) answer, I emailed Bjarne, and this was his reply:

Consider

struct S {
public:
    int a;
private:
    int b;
public:
    int c;
private:
    int d;
};

Is the compiler allowed to allocate the private members next to each other? (the answer is yes).

The reason for the rule was early ideas of separating private data from public data for some implementations to be able to alleviate code evolution problems when the data layout changed.
For example, if you allocated public before private, then adding a private member could be done without affecting the public interface (after creation). As far as I know, no compiler has ever done that.

However, some compilers do use rearrangement to create more compact layouts. For example:

struct SS {
    char a;
public:
    int b;
public:
    char c;
public:
    int d;
public:
    char e;
};

If members are allocated in declaration order, the size will be 5 words, but you can (legally) reorder to get 3 words (assuming a 4-byte word).

Personally, I have never found this useful.

I'll leave as an exercise how to reorder to get 3 words. <(^_^)>
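One possible answer to the exercise, as a sketch: put the two ints first and pack the three chars together after them. The struct names are hypothetical, the reordering is done by hand rather than by the compiler, and the sizes in the comments assume a 4-byte int with 4-byte alignment.

```cpp
// Declaration order from the example above (the repeated labels are dropped
// because the layout is written out by hand here).
struct Declared {
    char a;
    int  b;
    char c;
    int  d;
    char e;
};

// The same members grouped by alignment: ints first, chars packed together.
struct Compact {
    int  b;
    int  d;
    char a;
    char c;
    char e;
};

// With a 4-byte, 4-byte-aligned int this is typically 20 bytes versus 12,
// i.e. 5 words versus 3. The exact figures are an assumption about the target.
static_assert(sizeof(Compact) <= sizeof(Declared), "grouping by alignment does not cost space here");
```

In Declared each char is followed by three bytes of padding so that the next int is aligned. In Compact the three chars share a single word.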

But, a mystery still remains as André asked:

Can you write any piece of strictly conforming code for which 9p12 (the above stated standard snippet) makes ANY difference for a non-POD (Plain Old Data) type?

Put another way (this is a different formulation of the same question):
Can you write any piece of code that uses 9p12 (the fact that the order is specified) for a non-POD type, without invoking undefined behaviour?

Can you?!
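One candidate answer, offered as a sketch rather than a ruling: the relational operators. The standard's rules for comparing pointers with < say that, for two non-static data members of the same object that are not separated by an access-specifier (and whose class is not a union), the pointer to the later-declared member compares greater. That rule is not restricted to POD types. Tracker and its members are hypothetical names.

```cpp
#include <cassert>

// Non-POD: it has a user-provided constructor and a virtual function.
class Tracker {
public:
    Tracker() : first(1), second(2) {}
    virtual ~Tracker() {}

    // first and second share one access section, so the comparison below
    // has a specified result and does not rely on undefined behaviour.
    bool laterMemberIsHigher() const { return &first < &second; }

private:
    int first;
    int second;
};

// Small self-check; call it from main() to try it out.
inline void checkTracker() {
    Tracker t;
    assert(t.laterMemberIsHigher());
}
```

If first and second were placed under separate access labels, the same comparison would no longer have a result the C++03 wording pins down, which is exactly the freedom the quoted paragraph gives the compiler.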

14 comments:

Ram said...

But, the real problem lay in not whether it was allowed, but why has this been allowed?

There is one case where interleaved access specifiers may be necessary.

The vtable order for public/protected functions is set in stone.

Consider:

class FBCClass {
public:
    ...
    virtual int myFunc();
private:
    virtual void _Reserved2();
    virtual void _Reserved3();
    ...
};

Now we can simply pull _Reserved2() into the public section and use it without changing the vtable order.

But consider what happens when we have/need protected virtuals.

In such a case, in order to make room for future public virtuals without breaking binary compatibility, you will need to interleave sections like this:

class FBCClass {
public:
    ...
protected:
    ...
    virtual int myFuncProt();

public:
    virtual int NewFunc ();

private:
    virtual void _Reserved2();
    virtual void _Reserved3();
    ...
};

This is one use for such interleaving of sections, especially because the order of protected virtuals is also set in stone.

Hence this should answer the "why has this been allowed" question. I cannot think of any other use for interleaving sections.
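The reserved-slot idea can be sketched as two revisions of the same hypothetical class (the v1 and v2 namespaces, Widget, callBoth and the function names are all invented for illustration, and the placeholders are named reserved2/reserved3 because identifiers starting with an underscore and a capital letter are reserved in C++). Whether old binaries really keep working depends on the compiler laying out virtual functions in declaration order, which the C++ standard does not promise.

```cpp
#include <cassert>

namespace v1 {
class Widget {
public:
    virtual ~Widget() {}
    virtual int myFunc() { return 1; }
private:
    // Placeholders that reserve positions for future releases.
    virtual void reserved2() {}
    virtual void reserved3() {}
};
}

namespace v2 {
class Widget {
public:
    virtual ~Widget() {}
    virtual int myFunc() { return 1; }
    // Takes over the position of reserved2 in the declaration order.
    virtual int newFunc() { return 2; }
private:
    virtual void reserved3() {}
};
}

// Calls the old and the new virtual through the revised interface.
inline int callBoth(v2::Widget& w) { return w.myFunc() + w.newFunc(); }
```

The number and order of virtual declarations is the same in both revisions; only the name, signature and access of one of them changed. That is the property the technique relies on.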

Zaman Bakshi said...

No. The object model design has nothing to do with the implementation. How you implement a particular characteristic of C++ doesn't (should not) change the behaviour. If it does, it is considered a non-conforming (to the standard) implementation. A user has nothing to do with the virtual table, and the implementation has to take that into consideration.

Zaman Bakshi said...

Moreover, what you say about rigidity in vtables can be true (not that I can think of any) for a specific compiler (implementation). Vtables are always rigid, or else overriding would not work. One more thing: polymorphic behaviour need not be implemented with virtual tables; one can very well think of some other data structure or method. The efficiency of that method may or may not be better.
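To illustrate that last point, here is a minimal sketch of run-time dispatch without the virtual keyword: each object carries its own function pointer instead of a pointer to a shared table. All names are hypothetical, and this is not a description of how any particular compiler implements virtual functions.

```cpp
// Hypothetical names throughout.
struct Shape {
    // One function pointer per object instead of a pointer to a shared table.
    double (*area_fn)(const Shape&);
    double a;
    double b;
};

constexpr double rect_area(const Shape& s) { return s.a * s.b; }
constexpr double tri_area(const Shape& s) { return 0.5 * s.a * s.b; }

constexpr Shape make_rect(double w, double h) { return Shape{&rect_area, w, h}; }
constexpr Shape make_triangle(double base, double h) { return Shape{&tri_area, base, h}; }

// The caller does not know which concrete function runs.
constexpr double area(const Shape& s) { return s.area_fn(s); }
```

The trade-off is visible in the struct itself: every additional "virtual" operation would cost one more pointer per object, which is one reason shared tables are the usual choice.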

Ram said...

Are you implying that the behaviour of using declarative order in the binary is not consistent across compilers?

I cannot think of any compiler that produces vtables in any other order (say alphabetically)...

Ram said...

Using alternative methods to vtables might be one solution, but I'm not sure of the pros and cons.

But if we were to avoid the FBC problem and still use virtual we must rely on the compiler's careful ordering of the vtable for the objects.

Zaman Bakshi said...

>>>Are you implying that the behaviour of using declarative order in the binary is not consistent across compilers?<<<

What I am implying is that a language and its programming rules must be independent of their implementation, and C++ has been written that way. What Bjarne pointed out is that, as 'some' earlier systems had already made the grave mistake of making themselves non-conforming (using a code-evolution methodology instead of an abstract programming idiom), the standards committee had no other choice but to include this section. If all had been OK, this section wouldn't have existed.

Moreover, yes the order and the way of representing vtables is not consistent across compilers. Had they been, COM wouldn't have existed at all. Most of the companies have patented their data structures, hence, newer implementations have to use some other methods. cfront, Microsoft, IBM, all use different formats and techniques. Just check for the patents online.

>>>>>>>But if we were to avoid the FBC problem and still use virtual we must rely on the compiler's careful ordering of the vtable for the objects.<<<<<<<<<<

Yes, for a poor implementation. A code on conforming-implementation would be independent of this (defect).

Ram said...

>> Yes, for a poor implementation. A code on conforming-implementation would be independent of this (defect).

How would a code on conforming-implementation then avoid FBC without considering a specific vtable order? (Apart from using home-grown virtual behaviour, i.e. by using virtual itself).

Zaman Bakshi said...

Functions (or vtbls, if used) are not part of the object's image, and the standard places no condition on where the vptr (which, if used, is part of the object's image) should be.
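A small sketch of that point, with hypothetical class names. On the common implementations that use a hidden vptr, adding more virtual functions leaves the object size unchanged, whereas adding a data member grows it. The standard guarantees neither the vptr nor its position, so the comments describe typical behaviour only.

```cpp
struct OneVirtual {
    virtual ~OneVirtual() {}
    int value;
};

struct ManyVirtuals {
    virtual ~ManyVirtuals() {}
    virtual void f() {}
    virtual void g() {}
    virtual void h() {}
    int value;
};

struct ExtraData {
    virtual ~ExtraData() {}
    int value;
    double more;  // a data member is part of the object's image
};

// Typical result: the first two sizes match (only the table grew, not the
// object), while ExtraData is larger.
constexpr bool kFunctionsAreFree = sizeof(OneVirtual) == sizeof(ManyVirtuals);
constexpr bool kDataCostsSpace   = sizeof(ExtraData) > sizeof(OneVirtual);
```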

If you consider an example with data members, your concern is exactly valid. This is what Bjarne meant by 'code evolution'.

But, current implementations (most that I know of) don't use this notion. Rather, once defined you are not supposed to change the order of members in base class declaration and compile again. This will break the code.

class FBCClass {

public:
    // ...

private:
    int x;
    char* y;
    some_type p;

};

Now, if we add new sections to this

class FBCClass {

public:
    // ...

private:
    int x;
    char* y;
    some_type p;

private: //or public, whatever
    some_other_type q;

};

or, in this way

class FBCClass {

private: //or public, whatever
    some_other_type q;

public:
    // ...

private:
    int x;
    char* y;
    some_type p;

};

The new code will definitely break old system configuration. It is the user who has broken it and not the implementation. There are two choices:

1. Either you know how an implementation works and use it to your advantage. In which case code may not break (Code-evolution strategy). You are coding implementation specific.

2. Or, you understand that every time you reorder or add elements you have to recompile the whole system.

There is no third way. The point of question (within the community) was that if we follow either way the concerned section of the standard is useless. This is why Bjarne says he has never found it useful/fruitful.
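To make the breakage concrete, here is a sketch with two hypothetical revisions of a record. All members are public so that offsetof is well-defined, and double stands in for the unspecified some_other_type. Code compiled against the first revision has the old offsets baked in, so it would read the wrong bytes from an object built with the second.

```cpp
#include <cstddef>

namespace rev1 {
struct Record {
    int   x;
    char* y;
};
}

namespace rev2 {
struct Record {
    double q;  // new member placed first
    int    x;
    char*  y;
};
}

// x sits at offset 0 in rev1 but after q in rev2, so old object code that
// still uses offset 0 would now be looking at q.
static_assert(offsetof(rev1::Record, x) == 0, "x is first in rev1");
static_assert(offsetof(rev2::Record, x) >= sizeof(double), "x moved behind q in rev2");
```

Appending q at the end instead keeps the old offsets but still changes sizeof, so anything that allocates, copies or embeds the object by value needs recompiling as well.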

Can you provide a way to make your statement valid using data members?

Zaman Bakshi said...

In either of the stated examples the system uses (mostly) first come first allocation within the object's image. Though, it may use reordering (to decrease memory usage and request time) when compiled for a certain architecture. So, in either case the original image is broken. Coding for a certain implementation specific behaviour by keeping a certain code-evolution strategy in mind is undefined-behaviour.

Can you give any example in which the standard snippet is valid without rendering your code undefined?

Ram said...

> There is no third way.

So going implementation specific (like the example I showed) is the only known solution (though non-portable and UB) to get around FBC right?

I don't take the "recompile entire OS each time" strategy to really be a solution :P

>> Can you give any example in which the standard snippet is valid without rendering your code undefined?

Not that I can think of. Without re-ordering the object code would break.

One possible but really ugly method to alleviate is using an "ioctl" style method like:

virtual void PerformFunc (int _funcCode, void* _parameters);

This would switch and perform functions according to _funcCode, but it's not as efficient as having proper regular virtuals.
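A minimal sketch of such a dispatcher follows. The function codes, the Service class and checkService are hypothetical, and the return type is changed from void to bool here so that an unknown code can be reported.

```cpp
#include <cassert>

// Hypothetical function codes; new ones can be appended without adding virtuals.
enum FuncCode { kGetAnswer = 1, kDouble = 2 };

class Service {
public:
    virtual ~Service() {}

    // Single virtual entry point: the set of virtual functions never changes,
    // at the cost of type safety and a switch on every call.
    virtual bool PerformFunc(int funcCode, void* parameters) {
        if (!parameters) return false;
        int* value = static_cast<int*>(parameters);
        switch (funcCode) {
        case kGetAnswer: *value = 42; return true;
        case kDouble:    *value *= 2; return true;
        default:         return false;  // unknown code: report it, do nothing
        }
    }
};

// Small self-check; call it from main() to exercise the dispatcher.
inline void checkService() {
    Service s;
    int n = 0;
    assert(s.PerformFunc(kGetAnswer, &n) && n == 42);
    assert(s.PerformFunc(kDouble, &n) && n == 84);
    assert(!s.PerformFunc(99, &n));
}
```

The void* parameter is where the ugliness comes from: caller and callee must agree on the real type for each code, and the compiler cannot check that agreement.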

Zaman Bakshi said...

>>>So going implementation specific (like the example I showed) is the only known solution (though non-portable and UB) to get around FBC right?<<<

Exactly. This was the whole issue. Well, you know, sometimes (earlier, not now, I think) the standards committee has to give in to the demands of specific implementers. After all, C++ has C compatibility for this specific reason. Given a choice, Bjarne wouldn't have included C compatibility (as he points out in the D&E) within C++.

Ram said...

I see.

But why can't the committee say that vtable order SHOULD be in declarative order?

It makes no sense for a compiler to reorder it in alphabetical or any other order as only if it's in declarative order we can work around the FBC problem like shown above...??

Zaman Bakshi said...

>>>>I see.

But why can't the committee say that vtable order SHOULD be in declarative order?

It makes no sense for a compiler to reorder it in alphabetical or any other order as only if it's in declarative order we can work around the FBC problem like shown above...??<<<

Because virtual tables are an implementation issue and only one of many ways to implement the virtual mechanism. The standard does not comment on how to implement a certain feature, though it does sometimes impose constraints. The terms virtual table, vtbl, vtable, virtual pointer, etc. are not used by the standard. They are known to the programming world as a way to implement the virtual mechanism.
