CPP extern "C"

February 4, 2014

When is extern "C" {...} necessary?

TL;DR

When C code calls functions declared/defined in C++, or vice versa.

Details

extern "C" {...} gives the functions declared inside it 'C' linkage. This means the C++ compiler does not mangle their names. C code can then call such a function through a 'C' compatible header file that contains just its declaration: the definition lives in an object file produced by a C++ compiler, and the linker resolves the call using the plain 'C' name. The same mechanism works in the other direction, which is what the example below shows: C++ code calling a function compiled as C.
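To illustrate the C++-to-C direction described above, here is a minimal sketch of a C++ function exported with 'C' linkage. The name sum_to is hypothetical and not part of the original example; the point is only that the body may use any C++ feature while the symbol stays unmangled.

```cpp
// Sketch: a C++ definition that C code can link against.
// `sum_to` is a hypothetical name chosen for illustration.
#include <numeric>
#include <vector>

// A C caller would only need this declaration in a header:
//     int sum_to(int n);
extern "C" int sum_to(int n) {
    // The body is ordinary C++ (templates, containers, ...); only the
    // signature has to be expressible in C.
    std::vector<int> values(n > 0 ? n : 0);
    std::iota(values.begin(), values.end(), 1);  // fills 1, 2, ..., n
    return std::accumulate(values.begin(), values.end(), 0);
}
```

Because the signature uses only C types, the same declaration is valid in both languages; a parameter such as std::vector<int> would not be usable from C.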

Foo.h

void foo( void );

Foo.c

#include "Foo.h"

void foo( void ) {}

Bar.cpp

#include "Foo.h" 

int main() {
    foo();
    return 0;
}

Compiling this code with clang -O0 Bar.cpp Foo.c yields the following linker error:

Undefined symbols for architecture x86_64:
  "foo()", referenced from:
      _main in Bar-NdopIN.o
ld: symbol(s) not found for architecture x86_64

Swapping Bar.cpp with the following implementation fixes the problem:

Bar.cpp

#ifdef __cplusplus
extern "C" {
#endif

#include "Foo.h" 

#ifdef __cplusplus
}
#endif

int main() {
    foo();
    return 0;
}
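A common variation, sketched here as an assumption rather than as what the original project does, is to put the #ifdef __cplusplus guard inside the header itself, so that every C++ file including it gets 'C' linkage automatically. The sketch collapses header, C definition and C++ caller into one file so it can be compiled on its own; foo_answer and call_from_cpp are hypothetical names.

```cpp
// ---- what the shared header would contain (hypothetical Foo.h) ----
#ifdef __cplusplus
extern "C" {            // only seen by C++ compilers
#endif

int foo_answer(void);   // hypothetical function, declared with C linkage

#ifdef __cplusplus
}
#endif

// ---- what the C source file would contain (hypothetical Foo.c) ----
// In this single-file sketch the definition inherits the C linkage of
// the declaration above.
int foo_answer(void) { return 42; }

// ---- what a C++ caller would contain (hypothetical Bar.cpp) ----
// The include needs no extern "C" wrapper, since the header handles it.
int call_from_cpp() { return foo_answer(); }
```

With the guard in the header, a C compiler sees only the plain declaration, while a C++ compiler sees it wrapped in extern "C".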

Why is extern "C" {...} necessary?

Here is the interesting bit! It comes down to C vs C++ linkage. Linking is the step that resolves symbols (function identifiers, for instance) across object files, and name mangling is what gives each of those symbols a unique name.

Name uniqueness is required in (non-exhaustive list):

C and C++ name mangling are different

Assuming Bar.cpp has no extern "C" {#include "Foo.h"}, then the compiler/linker will respectively generate and see the following names:

Function identifier    C Name Mangling [1]    C++ Name Mangling [2]
void foo( void )       _foo [3]               __Z3foov [3]

[1] clang -c -O0 Foo.c
[2] clang++ -c -O0 Foo.c
[3] nm Foo.o

Hence, when resolving symbols, the linker cannot tell that _foo (defined in Foo.o) and __Z3foov (referenced from Bar.o) actually refer to the same function identifier, void foo( void ).
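One reason C++ encodes parameter types into symbol names is function overloading, which C does not have. The following sketch, using the hypothetical names describe and describe_c, shows two overloads coexisting under C++ mangling, while a 'C' linkage name can only be used once. The mangled names in the comments assume the Itanium C++ ABI that clang uses (macOS adds one more leading underscore).

```cpp
// Two C++ overloads: the parameter types are part of the mangled name,
// so the linker sees two distinct symbols
// (_Z8describei and _Z8described under the Itanium ABI).
int describe(int)    { return 1; }
int describe(double) { return 2; }

// With C linkage the symbol is just the bare name, so only one function
// called `describe_c` can exist in the program.
extern "C" int describe_c(int) { return 3; }

// Uncommenting the next line would not compile: two functions with
// C linkage cannot share a name, because both would map to the same symbol.
// extern "C" int describe_c(double) { return 4; }
```

This is why a C++ compiler cannot simply reuse the C naming scheme for every function, and why 'C' linkage has to be requested explicitly.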

That's why extern "C" {...} (C++ standard, section 7.5) is required. It tells the C++ compiler that all function identifiers declared within the extern "C" braces are to be generated, and therefore resolved by the linker, using "C" name mangling.

Assuming Bar.cpp now has extern "C" {#include "Foo.h"}, then the compiler/linker will respectively generate and see the following names:

Function identifier    C Name Mangling [1]    C++ Name Mangling [2]
void foo( void )       _foo [3]               _foo [3]

[1] clang -c -O0 Foo.c
[2] clang++ -c -O0 Foo.c
[3] nm Foo.o

Tada! Both symbols are the same, so the linker can resolve the symbol _foo used in Bar.o and defined in Foo.o, and the link succeeds!


I'm a developer at IO Stark.