How to make Xcode 8 C preprocessor ignore // comments in #defines

The C preprocessor (cpp) seems like it should handle this code correctly:

    #define A 1 // hello there
    int foo[A];

I would expect A to be replaced with 1.

Instead, A is replaced by 1 // hello there, which leads to the following output from cpp -std=c99 test.c:

    # 1 "test.c"
    int foo[1 // hello there];

This is invalid C and does not compile.

How can I get cpp to perform the correct replacement?

Compiler note: I am using cpp from the latest Xcode (8.2.1, December 2016) on a Mac, so I doubt this is caused by an outdated compiler.

3 answers

I can reproduce the problem on my Mac (macOS Sierra 10.12.2; Apple LLVM version 8.0.0 (clang-800.0.42.1)) using /usr/bin/cpp, which is the Xcode cpp, but not using GNU cpp (which I invoke as plain cpp).

Workarounds:

 /usr/bin/gcc -E -std=c99 test.c 

This uses the clang wrapper gcc to run the C preprocessor, and it handles the // comment correctly. You can add the -v option to see what it runs; I did not see it execute cpp per se (it runs clang -cc1 -E with a lot of other information).

You can also use:

 clang -E -std=c99 test.c 

This is effectively the same thing.

You can also install GCC and use it instead of the Xcode tools. There are questions with answers on how to do this (but it is not for the faint of heart).


Note that // is not a valid comment in C90. It was introduced in C99, so make sure your compiler and preprocessor know that they must use the C99 standard. In many cases that means passing -std=c99. (The question has since been edited to make this clear.)
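A minimal sketch of how the two standards can disagree about //. The helper names slash_slash_probe and check_line_comments are hypothetical, and the sketch assumes a compiler in C99 or later mode; under strict C90 rules the same line would be expected to read as 4 / 2 instead.

```c
#include <assert.h>

/* Hypothetical probe: returns 4 when // starts a comment (C99 and later).
   Under strict C90 rules the first line reads "4 / 2", because only the
   empty block comment after the first slash is removed, so it would
   return 2 instead. */
int slash_slash_probe(void) {
    return 4 //**/ 2
        ;
}

/* Hypothetical self-check: aborts unless // was treated as a comment. */
void check_line_comments(void) {
    assert(slash_slash_probe() == 4);
}
```

Calling check_line_comments() from main is one quick way to confirm which comment rules the compiler actually applied.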


Further, I don't think the preprocessor itself cares about comments. Section 6.10 of the C99 specification gives the grammar of preprocessing directives, and nowhere does it mention comments.

The ANSI C standard states explicitly that comments are replaced in phase 3 of section 2.1.1.2, "Translation Phases" (5.1.1.2 in C99). (Credit to this other answer.)

  3. The source file is decomposed into preprocessing tokens and sequences of white-space characters (including comments). A source file shall not end in a partial preprocessing token or in a partial comment. Each comment is replaced by one space character. New-line characters are retained. Whether each nonempty sequence of white-space characters other than new-line is retained or replaced by one space character is implementation-defined.

Older tools may not have respected this, either because they predated any C standard, had bugs, or interpreted the standard differently. They have probably kept those bugs and quirks for backward compatibility. Testing with clang -E -std=c99 versus /usr/bin/cpp -std=c99 confirms this: they behave differently, even though the same compiler is under the hood.

    $ /usr/bin/cpp --version
    Apple LLVM version 8.0.0 (clang-800.0.42.1)
    Target: x86_64-apple-darwin16.3.0
    Thread model: posix
    InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
    $ clang --version
    Apple LLVM version 8.0.0 (clang-800.0.42.1)
    Target: x86_64-apple-darwin16.3.0
    Thread model: posix
    InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
    $ ls -l /usr/bin/cpp
    -rwxr-xr-x 1 root wheel 18240 Dec 10 01:04 /usr/bin/cpp
    $ ls -l /usr/bin/clang
    -rwxr-xr-x 1 root wheel 18240 Dec 10 01:04 /usr/bin/clang
    $ /usr/bin/cpp -std=c99 test.c
    # 1 "test.c"
    # 1 "<built-in>" 1
    # 1 "<built-in>" 3
    # 330 "<built-in>" 3
    # 1 "<command line>" 1
    # 1 "<built-in>" 2
    # 1 "test.c" 2
    int foo[1 // hello there];
    $ /usr/bin/clang -E -std=c99 test.c
    # 1 "test.c"
    # 1 "<built-in>" 1
    # 1 "<built-in>" 3
    # 331 "<built-in>" 3
    # 1 "<command line>" 1
    # 1 "<built-in>" 2
    # 1 "test.c" 2
    int foo[1];

I suspect that invoking clang as /usr/bin/cpp triggers bug and quirk compatibility with the original cpp behavior, established back when the correct behavior was unclear.

I think the lesson is to use cc -E rather than cpp to ensure consistent behavior.
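A small self-check can make the expected behavior visible. This is only a sketch, assuming a compiler driver in C99 or later mode; foo_length and check_foo are hypothetical helper names, not part of the question. If the comment leaked into the expansion of A, the declaration of foo would not compile at all.

```c
#include <assert.h>
#include <stddef.h>

/* The comment should be replaced by a space (phase 3)
   before A is expanded (phase 4). */
#define A 1 // hello there

int foo[A];

/* Hypothetical helper: number of elements in foo. */
size_t foo_length(void) {
    return sizeof foo / sizeof foo[0];
}

/* Hypothetical self-check: aborts if A did not expand to exactly 1. */
void check_foo(void) {
    assert(foo_length() == 1);
}
```

Running the same file through cc -E should show the declaration as int foo[1]; as in the clang -E output above.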


From the C11 specification (emphasis mine):

5.1.1.2 Translation Phases

The precedence among the syntax rules of translation is specified by the following phases. 6)

  1. [...] multibyte characters are mapped [...] to the source character set [...] Trigraph sequences are replaced [...]

  2. Each instance of a backslash character (\) immediately followed by a new-line character is deleted, splicing physical source lines [...]

  3. The source file is decomposed into preprocessing tokens and sequences of white-space characters (including comments). [...] Each comment is replaced by one space character. [...]

  4. Preprocessing directives are executed, macro invocations are expanded, and _Pragma unary operator expressions are executed. [...]

where footnote 6) reads:

Implementations shall behave as if these separate phases occur, even though many are typically folded together in practice. Source files, translation units, and translated translation units need not necessarily be stored as files, nor need there be any one-to-one correspondence between these entities and any external representation. The description is conceptual only, and does not specify any particular implementation.

Therefore, an implementation conforming to the C11 specification is not required to have a separate preprocessor. This means the cpp command can do whatever it wants, and the compiler driver may carry out phases 1 through 3 however it wants. The correct way to get the preprocessed output is therefore to call the compiler driver with cc -E.
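The ordering of phases 2, 3 and 4 can be observed from inside a program. The following is a sketch under the assumption of a conforming C99 or later compiler; SPLICED, BLOCK, STR, XSTR, block_text and check_phases are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <string.h>

/* Phase 2 joins the two physical lines into one logical line,
   phase 3 then drops the trailing comment,
   and phase 4 sees the replacement list "10 + 5". */
#define SPLICED 10 \
    + 5 // trailing comment

/* Phase 3 replaces the block comment with one space
   before phase 4 expands BLOCK. */
#define BLOCK 2 /* replaced by a space */ + 3

/* Two-level stringification so the argument is macro-expanded first. */
#define STR(x) #x
#define XSTR(x) STR(x)

/* Hypothetical helper: the expansion of BLOCK as text. */
const char *block_text(void) {
    return XSTR(BLOCK);
}

/* Hypothetical self-check: aborts if a comment survived into an expansion. */
void check_phases(void) {
    assert(SPLICED == 15);
    assert(BLOCK == 5);
    assert(strcmp(block_text(), "2 + 3") == 0);
}
```

Stringifying the expansion is a convenient way to confirm that no comment text reached phase 4, without having to read the output of cc -E.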


Source: https://habr.com/ru/post/1262668/

