The compilation process consists of four key steps: preprocessing, compilation, assembly, and linking, as shown in the image below. In this series of blog posts, we’ll explore each of those steps, highlighting the most important aspects and sharing interesting insights along the way. 😊

Preprocessing (c/cpp -> .i)

The first phase is preprocessing. The input is a .c or .cpp file, and the output is a .i file (GCC conventionally uses .ii for preprocessed C++). Let’s dive into what happens during this stage.
To illustrate, we’ll use the following example:

Example 1

foo.h
// This is a comment in header file
int foo(void);

main.c
#include "foo.h"

// We have a macro
#define SQUARE(x) (x * x)

#define PI 3.14

/* Our main function
 * with multi-line comment
 */
int main() {
  int result = foo() + SQUARE(3);
  float area = PI * 5 * 5;
  return result + area;
}

Note: The compilation process can be stopped at any step. In this case, we want to stop it after preprocessing, and to do that, we use the -E flag.
gcc -E main.c -o main.i

The result:

main.i
# 1 "main.c"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 418 "<built-in>" 3
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 "main.c" 2
# 1 "./foo.h" 1


int foo(void);
# 2 "main.c" 2
# 11 "main.c"
int main() {
  int result = foo() + (3 * 3);
  float area = 3.14 * 5 * 5;
  return result + area;
}

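Let's look at what changed:

  1. Header File Inclusion - The preprocessor replaces each #include directive with the actual contents of the specified header file. Here, the declaration int foo(void); comes from foo.h.
  2. Line Control Statements (# Directives) - Lines such as # 1 "./foo.h" 1 are line markers. They record the original file name and line number, so later stages can report errors and debug information against the original source rather than the .i file.
  3. Comment Removal - Both single-line and multi-line comments are removed.
  4. Macro Expansion (#define) - The preprocessor replaces each macro with its definition. This is pure token substitution; nothing is evaluated at this stage.
    • int result = foo() + SQUARE(3); -> int result = foo() + (3 * 3);
    • float area = PI * 5 * 5; -> float area = 3.14 * 5 * 5;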

Example 2

#ifdef DEBUG
#include <stdio.h>
#endif

int main() {
    #ifdef DEBUG
    printf("Debug mode is enabled.\n");
    #endif
    return 0;
}
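  1. Conditional Compilation (#ifdef, #ifndef, #endif) - Code between the directives is kept only if the condition holds; otherwise the preprocessor drops it entirely.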

Since DEBUG was not defined, the output after preprocessing looks like this:

...

int main() {
  return 0;
}

A well-known application of conditional compilation is include guards. Consider this simplified example:

#include "foo.h"
#include "foo.h"
#include "foo.h"
#include "foo.h"

int main() { 
  return 0; 
}

Without include guards, the output after preprocessing might look something like this:

...
int foo(void);
# 2 "main_multi.c" 2
# 1 "./foo.h" 1


int foo(void);
# 3 "main_multi.c" 2
# 1 "./foo.h" 1


int foo(void);
# 4 "main_multi.c" 2
# 1 "./foo.h" 1


int foo(void);
# 5 "main_multi.c" 2

int main() {
  return 0;
}

Of course, this is a silly example, but in larger projects, this issue can happen unintentionally. For example:

#include "foo.h"
#include "bar.h"
#include "tar.h"

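If both bar.h and tar.h also include foo.h, the preprocessor pastes the contents of foo.h into the translation unit multiple times. Repeated function declarations like the one above are harmless, but repeated definitions (for example, of a struct or an inline function) typically cause redefinition errors.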

To prevent multiple inclusions, we use include guards like this:

#ifndef FOO_H
#define FOO_H

int foo(void);

#endif // FOO_H

Example 3

#include <stdio.h>

#define CONCAT(a, b) a##b

int main() {
  int CONCAT(my, Var) = 10; // Expands to int myVar = 10;
  printf("File: %s, Line: %d\n", __FILE__, __LINE__);

  return 0;
}

The result:

// Contents of stdio.h omitted

int main() {
  int myVar = 10;
  printf("File: %s, Line: %d\n", "main.c", 7);

  return 0;
}
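  1. Token Pasting (##) - CONCAT(my, Var) joins the two tokens into the single identifier myVar. The related stringizing operator (#), which turns a macro argument into a string literal, is not used in this example.
  2. Built-in Macros - __FILE__ and __LINE__ are predefined macros that expand to the current file name and line number.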

The next step in the process is compilation. Stay tuned for the next post!

Author: rndthts.dev