Sunday, December 3, 2006

C/C++ : Speed Variation

People believe that C code executes faster than C++ code. I argue otherwise. Many C++ gurus have spent their valuable time explaining that there's no difference. Now, I may lack credibility, but the creator of C++, Bjarne Stroustrup, certainly doesn't. So, would you believe his words? While browsing the Internet, I came across an email thread in which Bjarne explains his views on this contentious topic. Here's the email (in full):


TITLE: Speed and size of C versus C++


PROBLEM: ???

I also heard the size of C++ program is generally bigger than C. I am a C programmer trying to learn C++. So, don't blame me if I have created any misconception about C++.


RESPONSE: ???

For one minor detail: including printf/scanf can include more code than is actually used. An intelligent C++ linker will only get the parts of the stream library that are REALLY needed.

RESPONSE: cshaver@informix.com (Craig Shaver @ Informix Software, Inc.)

Who implements an intelligent C++ linker??? I was under the impression that you get all the functionality in a class when you link, whether you use it or not.


RESPONSE: bs@alice.att.com (Bjarne Stroustrup), 12 Jan 93
AT&T Bell Laboratories, Murray Hill NJ

Let me try to clear up one or two points. Consider first a somewhat minimal C and C++ program x1.c:

main()
{
    int i;
    for (i = 0; i < 1000000; i++) printf("Hi,mom!\n");
}

and its more C++ looking cousin x2.c:

main()
{
    int i;
    for (i = 0; i < 1000000; i++) cout << "Hi,mom!\n";
}

I compiled and ran x1.c and x2.c:

c: cc x1.c
c: size a.out
text  | data | bss  | dec   | hex
12288 | 6144 | 7608 | 26040 | 65b8

c: time a.out > /dev/null
25.5u 0.5s 37r a.out

c: PTCC x1.c
c: size a.out
text  | data | bss  | dec   | hex
12288 | 6144 | 7620 | 26052 | 65c4

c: time a.out > /dev/null
25.7u 0.4s 33r a.out

PTCC is the driver for my standard off-the-shelf Cfront 3.0 (i.e. I'm not using any technology you couldn't buy half a year ago). Note that the size of the generated code is essentially the same. So is the speed. Running these examples a few times to eliminate random error in the timing mechanism shows that the run-time isn't biased one way or the other.
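(An aside from me, not part of Bjarne's email.) If you want to repeat this kind of measurement on a current compiler, the "run it a few times" advice can be sketched as below. The helper `best_of` and the function names are my own inventions for illustration, and keeping the fastest of several runs is my assumption about how to reduce timing noise; the email used the shell's `time` command instead, which also reports user and system time separately.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdio>
#include <iostream>
#include <limits>

// Hypothetical helper (not from the email): run `work` several times and
// return the fastest wall-clock time in seconds.
template <class Work>
double best_of(int runs, Work work) {
    assert(runs > 0);
    double best = std::numeric_limits<double>::infinity();
    for (int r = 0; r < runs; ++r) {
        auto start = std::chrono::steady_clock::now();
        work();
        auto stop = std::chrono::steady_clock::now();
        std::chrono::duration<double> elapsed = stop - start;
        best = std::min(best, elapsed.count());
    }
    return best;
}

// Same loop body as x1.c, using stdio.
void stdio_loop(long n) {
    for (long i = 0; i < n; ++i) std::printf("Hi,mom!\n");
    std::fflush(stdout);
}

// Same loop body as x2.c (shown later in the email), using stream I/O.
void stream_loop(long n) {
    for (long i = 0; i < n; ++i) std::cout << "Hi,mom!\n";
    std::cout.flush();
}
```

To use it, call something like `best_of(5, [] { stdio_loop(1000000); })` and `best_of(5, [] { stream_loop(1000000); })` from `main`, print the two results to `std::cerr`, and redirect standard output to /dev/null as the email does.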

This is what you should expect for a program in the common sub-set of C and C++. There are fundamental reasons for that. You should expect identical code from two C and C++ compilers using the same technology. The only possibly SYSTEMATIC difference I can think of is that a C++ compiler can use better function call sequences than a C compiler that doesn't apply global optimization because in many cases a C compiler must guard against possible calls with differing numbers of arguments where a C++ compiler doesn't need to because of C++'s stronger type checking. In most C and C++ compilers, this difference is theoretical, but I'm told that in Zortech C++ it is real (i.e. C++ programs are ever so slightly faster than their C equivalents). However, this is all noise; I doubt the difference between C and C++ in this kind of comparison matters to any real programmers. The difference is far smaller than differences between different C compilers - but surprisingly, it is in C++'s favor.
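(Another aside from me, not part of the email.) The type-checking point can be made concrete. In the C of that era, a declaration such as `int no_args();` said nothing about the parameters, so a call with extra arguments would still compile. In C++ the same declaration means "takes exactly zero arguments", and a mismatched call is rejected at compile time. The sketch below only shows that checking; it does not demonstrate any speed difference. The function names are hypothetical, and it assumes a C++17 compiler for `std::is_invocable_v`.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical functions, for illustration only.
int no_args() { return 0; }      // in C++: exactly zero arguments
int one_arg(int x) { return x; } // in C++: exactly one int argument

// Each constant records whether a call with the given argument list
// would compile. The compiler knows the answer at every call site.
constexpr bool zero_ok   = std::is_invocable_v<decltype(&no_args)>;
constexpr bool zero_plus = std::is_invocable_v<decltype(&no_args), int>;
constexpr bool one_ok    = std::is_invocable_v<decltype(&one_arg), int>;
constexpr bool one_short = std::is_invocable_v<decltype(&one_arg)>;
constexpr bool one_extra = std::is_invocable_v<decltype(&one_arg), int, int>;

static_assert(zero_ok && !zero_plus, "no_args accepts zero arguments only");
static_assert(one_ok && !one_short && !one_extra,
              "one_arg accepts exactly one argument");
```

Because every call is known to match its declaration, a C++ compiler never has to generate a calling sequence that tolerates a wrong argument count.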

Programs in the common subset of C and C++ result in equal-sized code that executes at equal speed.

If that conclusion doesn't appear to hold, check if your C and C++ compilers are of similar quality. If your C++ compiler appears to lose badly you have the option of using a Cfront variant to get the benefits of your C compiler's code generation facilities. If your C compiler loses badly, switch to C++ even if you aren't ready to use the ``++ features.''

Now, a common argument is ``OK, so C++ can match C for C programs but as soon as you use the REAL C++ features your programs get bigger and slower.'' Clearly you can write big and slow programs in any language (even C), but you don't necessarily take a performance hit when you start using C++. Consider x2.c. It uses the C++ stream I/O library that is certainly bigger than C's stdio and is unlikely to be tuned to the same degree as stdio. It is also a library that uses a very large sub-set of C++'s features in its interface and implementation (operator overloading, multiple inheritance, virtual functions, etc.):

c: PTCC x2.c
c: size a.out
text  | data | bss | dec   | hex
17408 | 2048 | 0   | 19456 | 4c00

c: time a.out > /dev/null
32.8u 1.0s 43r a.out

Surprisingly enough, the code generated for x2.c is noticeably smaller than the code generated for x1.c (75% of x1.o), though - as expected - it runs a bit slower (29% more user CPU time, 16% more elapsed time).
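(An aside from me, not part of the email.) The percentages can be checked against the transcripts above by comparing the `cc x1.c` run with the `PTCC x2.c` run. The helper `percent_change` and the constant names below are mine; the figures are copied from the `size` and `time` output.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: by what percentage `value` exceeds `baseline`
// (the result is negative when `value` is smaller).
double percent_change(double baseline, double value) {
    return (value - baseline) / baseline * 100.0;
}

// Figures copied from the transcripts (cc x1.c versus PTCC x2.c).
const double x1_total_bytes = 26040.0; // "dec" column for x1.c
const double x2_total_bytes = 19456.0; // "dec" column for x2.c
const double x1_user_secs = 25.5, x2_user_secs = 32.8;
const double x1_real_secs = 37.0, x2_real_secs = 43.0;

// Size of x2 as a percentage of x1: about 75.
const double size_percent = x2_total_bytes / x1_total_bytes * 100.0;
// Extra user CPU time for x2: about 29 percent.
const double user_percent = percent_change(x1_user_secs, x2_user_secs);
// Extra elapsed time for x2: about 16 percent.
const double real_percent = percent_change(x1_real_secs, x2_real_secs);
```

So the stream version is roughly three quarters of the size, and it takes about 29% more user CPU time and about 16% more elapsed time.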

I claim, but cannot prove, that the run-time overhead is primarily a difference in tuning. Other programs that rely heavily on C++ features show improvements over their C counterparts - and others again show overhead. The differences do not appear systematic to me; that is, they are differences in design and effort, rather than inherent overhead in C or C++.

The space advantage of the C++ program is an advantage of the same kind; that is, it is there because a little extra care and thought was spent. Other implementations of stream I/O will show different space and time usage, as will different implementations of stdio. To do simple things only the essential parts of the stream I/O library are brought in. You don't actually need a very ``intelligent'' linker, the dumb old Unix ld will do: Just manually split your implementation into several .c files. A simple example:

X.h:
class X {
    // details
public:
    void f(); // common function
    void g(); // uncommon function
    // more functions
};


X1.c:
// common functions:

void X::f() { ... }


X2.c:
// uncommon functions:

void X::g() { ... }

Now, any half-way decent archive program can bring in the object code for X1.c (only) for programs that use the common functions (only) and leave the expense of bringing in the object code for X2.c for the programs that actually use functions defined in X2.c.
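(An aside from me, not part of the email.) Here is the same layout as one compilable sketch, with comments marking where each file would begin. The function bodies and the `int` return types are my own placeholders so that there is something to call; the email leaves the bodies elided. The build commands in the comments assume a Unix-style toolchain with `c++` and `ar`, and the file names `main.cpp` and `libX.a` are hypothetical.

```cpp
#include <cassert>

// ---- X.h ----
class X {
public:
    int f(); // common function (placeholder body below)
    int g(); // uncommon function (placeholder body below)
};

// ---- X1.c: common functions ----
int X::f() { return 1; }

// ---- X2.c: uncommon functions ----
int X::g() { return 2; }

// With the definitions in separate files, a build could look like:
//   c++ -c X1.c X2.c          (produces X1.o and X2.o)
//   ar rc libX.a X1.o X2.o    (bundles both into one archive)
//   c++ main.cpp libX.a       (links the program against the archive)
// When linking against an archive, ld copies in only the members that
// resolve a symbol the program still needs. A main.cpp that calls only
// X::f() therefore gets X1.o and leaves X2.o behind.
```

The granularity is the object file: everything defined in X1.c comes in together, which is why rarely used functions go in a file of their own.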

There exist linkers that can do that without human help (mostly in the PC world); I just happen not to have one. I think it is important to note that this technique and the tools that support it carried over from C to C++. We weren't at the mercy of some ``smart'' and possibly expensive or unavailable technology. We don't have to forget or lose all of our effective techniques in moving from C to C++. We should - as ever - use them with a suitable amount of judgement.

C++ was designed not to leave room ``below'' for a lower level language, except assembler for machine specific operations.

I found this email at the following link:
http://nkari.uw.hu/Tutorials/CPPTips/split_impl

You can click on it to check for any inconsistencies. I take his word on this issue. Do you?