Getting Started with C: Creating an Executable

How to compile a program in C.

What is a compiler?

If I remember correctly, this was probably the first major hurdle I hit while studying at university. Something about it just didn’t click. What am I doing? Why are we writing commands into the terminal? Can’t we just push a button?

I had been used to languages like C# and Python, where you never have to run a compiler yourself.

These languages rely on another application, an interpreter or runtime, that translates the code into instructions your CPU can perform while the program is running, working as a middleman. (Strictly speaking, C# is compiled to an intermediate bytecode first, but a runtime typically still does the final translation for you.) If you’re in the game development world, examples of this are: C# in Unity, Blueprints in Unreal and GDScript in Godot.

Compiled languages, on the other hand, require a compiler to take the code you wrote, and translate it into a format your CPU can do something with before you can actually run your program.

While you can write code for games and other graphical applications in interpreted languages just fine, you generally get a performance increase by using a compiled language like C or C++, because the code doesn’t have to be translated at runtime. The CPU doesn’t have to do any extra work to make the code go; the information it needs is already right there in the executable.

I won’t get too in-depth on that topic, since it can get a bit overwhelming and doesn’t help us get into writing code. All we need to know is that we have to perform this additional step to make an application in C / C++.

Why run the compiler from the command line?

Some code editors will allow you to hit that run button I fantasised about in university.

However…

As painful as it might sound now if you’re not used to command line interfaces (CLIs), much like I was back then, I would advocate for at least learning the basics of how to run a compiler on the command line. Not necessarily all the specifics, but enough to know how to find the information you need.

The reason for this is that if you know how to invoke a compiler using the CLI, you can write bash scripts that automate the build process in a number of ways. While you may be able to do this in your code editor, you’ll be able to use these scripts anywhere. They will not be bound to a specific code editor.

For example, this website has a build.sh script, and while the compilation process is different, it allows me to run a build command, add changes to source control, and publish the site from just running one script.

These scripts can be set up once and forgotten about until you want to add another step to your build sequence.

What does a compiler actually do?

There are different compilers for C and C++ depending on what operating system you’re using. Most commonly, that’s MSVC on Windows, and gcc/g++ or clang on Linux-based systems, though you can use the latter two on Windows as well.

All compilers follow a set of rules supplied by the language standard, which defines what is and isn’t valid code.

Some parts of the C and C++ standards are more loosely defined than others, meaning some compilers may be fine with some things that others are not. It’s just something to watch out for if you’re working in a multi-platform team.

C and C++ compilers turn your code into an executable in three stages:

  • Pre-processing
  • Building
  • Linking

Pre-processing

This will become clearer later on in this series of posts, but during this stage the compiler goes over everything in your files that is not actual C code. These are pre-processor directives (lines starting with #, such as #include and #define): instructions you can write in C files that tell the compiler how to prepare the code for the next step.
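As a small illustration of what the pre-processor deals with, here is a minimal sketch. The names MAX_PLAYERS, SQUARE and total_slots are made up for this example; only the lines starting with # are handled during pre-processing, and everything else is left for the building stage.

```c
#include <stddef.h> /* directive: pastes the contents of a standard header here */

#define MAX_PLAYERS 4         /* directive: every MAX_PLAYERS below becomes 4 */
#define SQUARE(x) ((x) * (x)) /* directive: a macro that is expanded as text */

/* Ordinary C code. By the time the building stage sees this function,
   the return statement has already been rewritten to ((4) * (4)). */
int total_slots(void)
{
    return SQUARE(MAX_PLAYERS);
}
```

With gcc or clang, passing the -E option stops after pre-processing and prints the result, which is a handy way to see what this stage actually did to a file.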

Building

This is the stage where we actually convert the C code into machine code for the CPU’s consumption. This will output what are referred to as “intermediate” files (often called object files), which are kind of a half-baked application: all the instructions the CPU needs to perform, but no context for how they interconnect. In a lot of cases, you’ll probably only output one file here, but there can be cases where you want to output multiple intermediates for linking together later. This is also where compile errors and warnings are generated, for anything that isn’t valid code.

Linking

Linking takes all the intermediate files we generated in the last step and binds them together into an executable you can run, making sure every function you described actually exists and can be called. This is also where any external code that’s pre-packaged in what’s referred to as a “library” is added to your executable. There are a lot of finer details here, but for now a basic understanding of each stage is all you need to get started. This is where you may hit “linking errors”, which are usually easy to spot in your compiler output: MSVC labels them with error codes beginning with LNK, while gcc and clang typically report an “undefined reference” or “undefined symbol”. This usually means the building step was able to see that the code you’re trying to call was declared somewhere, but not defined. We’ll be going into what that means later.
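To make “declared but not defined” a bit more concrete, here is a rough sketch. The function names add and add_three are hypothetical and only exist for this example.

```c
/* Declaration: a promise to the compiler that add() exists somewhere. */
int add(int a, int b);

/* The building stage is happy with this call because of the declaration
   above. It does not need to know where the body of add() lives yet. */
int add_three(int a, int b, int c)
{
    return add(add(a, b), c);
}

/* Definition: the actual body of add(). If this were deleted, the file
   would still get through the building stage, but the linker would fail
   because nothing provides the code for add(). */
int add(int a, int b)
{
    return a + b;
}
```

In a larger project the definition would often live in a different source file or in a library, which is exactly the gap the linker is there to fill.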

What are Compiler Arguments?

Compilers allow you to tell them how you want them to compile the code you give them.

This can be useful for enabling/disabling optimisations, as well as a long list of very specific features for your use case.

I won’t go into too much detail about it just now, but you can find these on the documentation web pages for the compiler you’re using. It’s usually easiest to search for something like “msvc options” or “msvc compiler flags”, where “msvc” is replaced with the compiler you’re using.

If you’re interested, you can find them here:

Thankfully there is a lot of overlap, especially between gcc/g++ and clang.

gcc and g++ share the same page for command line options, but gcc is for C and g++ is for C++. In most cases g++ will still compile C code, but it treats it as C++, and there are some features in C that are incompatible with C++.
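One commonly cited example of such an incompatibility is sketched below. The function name make_counter is made up for illustration. In C, the void pointer returned by malloc converts to int * automatically, whereas a C++ compiler would reject the same line unless a cast is added.

```c
#include <stdlib.h>

/* Allocates a single int on the heap and sets it to zero.
   Returns NULL if the allocation fails. The caller must free() the result. */
int *make_counter(void)
{
    /* Valid C: void * converts to int * implicitly.
       A C++ compiler requires an explicit cast here. */
    int *counter = malloc(sizeof *counter);

    if (counter != NULL) {
        *counter = 0;
    }
    return counter;
}
```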

Actually Compiling C

Ok that was a lot of talk about something we don’t even know how to use yet.

We’ll be using the command line to compile here, which, unless you’re a Linux elemental, might look a bit scary. Don’t worry, it’s actually not too bad once you get the hang of it.

Source files

First you’ll need a source file (a file that contains code) to compile. You can make an empty file to write code in later. The standard naming convention is usually main.c or main.cpp for the file where your program starts, but really you can call this whatever you like as long as it ends in .c (for C) or .cpp (for C++).

It’s best to keep your .c and .cpp files in a directory called src or source to keep your project directory clean. E.g. our main.c will be in /project_path/src/main.c. You can subdivide these folders even further if you’re working on large projects that benefit from functionality being spread across files. e.g. /project_path/src/assets/png.c could be a source file with all the functions related to png loading inside.

Entry point

You’ll also need to add what’s called an “Entry Point”. This tells the compiler “hey, start here in this function”. We’ll go more into functions and entry points in later posts, but for now open the file you just created inside your preferred text editor. Then enter the following and save:

int main(int argc, char** argv)
{
    return 0;
}

Build script

Next we’re going to create a bash script. We won’t go too hard on the bashing, but making a script that runs the compile command is a lot more convenient than having to type it out every time.

Create a file called build.sh in your project directory. Then open it in your text editor.

Here we’re actually going to write the command we need to build the project.

Each compiler mentioned has a different command you need to run to make it go, but the basic format is pretty much the same.

command input_file.c

So our build command will look like…

MSVC:

cl src/main.c

gcc:

#!/bin/sh
# ^ on Linux, add this as the first line of the file; it tells your system which shell to run the script with

gcc src/main.c

clang:

#!/bin/sh
# ^ on Linux, add this as the first line of the file; it tells your system which shell to run the script with

clang src/main.c

Type the relevant command into your build.sh file, save it, and run the script either by double clicking it or by opening your command line in the project directory and entering the command ./build.sh

Congrats! You just compiled your first C application.

For MSVC you’ll need to run the build.sh script from the command line, running the command vcvarsall [cpu architecture] before you do so. For example, my system is 64 bit, so my command would be vcvarsall x64. This is because MSVC needs to set up environment variables in order to access the correct cl command. These variables are only set for the command line window that vcvarsall was run in; once the window is closed you’ll need to run it again to compile. While you could add vcvarsall x64 to your build.sh script, in my experience it’s usually best to run this once in the command line and then call build.sh in the same window, as this command takes a few seconds to run, which you don’t want to wait for every time you build.

Depending on how you installed your compiler, you may need to add something to the PATH variable on your operating system. If you’re on Linux and installed through your package manager, the command should already be available to you. But on Windows, you may need to add it manually so the command is accessible in any directory. The easiest way to do this is to hit the Windows key and type “PATH” into the search box; there should be an option named something along the lines of “PATH” and/or “environment variables”. Select this option, find the “PATH” variable, click “Edit”, add a semicolon to the end of the existing text, then add the path to where your compiler is installed. My MSVC path is: c:/Program Files/Microsoft Visual Studio/2022/Community/VC/Auxiliary/Build. This folder contains vcvarsall.bat, which sets up the command line to be able to run the compile commands.

Depending on your code editor, you may want to pass an absolute path into the compiler rather than a relative one, for example the Windows path C:/dev/project/src/main.c rather than src/main.c. At least in my case, if I use relative paths and select an error in the console to jump to it, the editor will create a temporary copy of the file instead of opening the actual file itself.

For Linux, remember to call chmod +x script-path, where script-path is replaced by the path to build.sh. This marks the script as executable; otherwise you will not be able to run it.

That was a lot of info blocks wasn’t it…

Changing the output directory

By default, the compiler will spit out your executable in the directory the command was called from. It’s usually best to make a separate directory called build to store all your output files in. This makes it easier to delete all built files, and ignore this directory if you’re using source control.

To output to this directory we can make the following changes to our build.sh:

MSVC:

# makes directory, adding -p will only create the directory, 
# if it doesn't already exist.
mkdir -p ./build

# change directory to the build directory
cd ./build

# ../ prefix to go "up" one directory, as we're
# no longer in the project root
cl ../src/main.c

gcc:

# makes directory, adding -p will only create the directory, 
# if it doesn't already exist.
mkdir -p ./build

# change directory to the build directory
cd ./build

# ../ prefix to go "up" one directory, as we're
# no longer in the project root
gcc ../src/main.c 

clang:

# makes directory, adding -p will only create the directory, 
# if it doesn't already exist.
mkdir -p ./build

# change directory to the build directory
cd ./build

# ../ prefix to go "up" one directory, as we're
# no longer in the project root
clang ../src/main.c

In bash, # starts a comment; anything following it on that line will not be considered when running the script.

You can also specify the output directory along with the output file name by using compiler options, however this only changes the directory that the final executable will be found in. In my experience, I’ve found it’s usually just easiest to change directory to the build folder and then run the compile command from there.

Executable Name

You may also want to change the name of the executable you’ve built. While you can just rename it yourself every time you build, there are compiler options to name the output when it’s compiled, which makes things easier.

MSVC:

mkdir -p ./build
cd ./build

# -link lets the compiler know you're talking about the linking stage,
# -out: tells it to output the final result under the name provided
cl ../src/main.c -link -out:main.exe

gcc:

mkdir -p ./build
cd ./build

# -o: tells the compiler to output the executable using the name provided
gcc ../src/main.c -o main.exe

clang:

mkdir -p ./build
cd ./build

# -o: tells the compiler to output the executable using the name provided
clang ../src/main.c -o main.exe

As said in the last section, you can also specify a directory with your output. To do this, just prefix the directory to the text after the output option. E.g. for gcc: gcc ../src/main.c -o main/main.exe will output into the “main” directory if it exists, creating a file called main.exe within.

Linux executables need not have the .exe extension. While Windows requires extensions to know what to do with a file, Linux will know what to do with the file based on its contents and permissions. However, it doesn’t hurt to add an extension if you’d like.

Debug info

Just, one more thing…

Compilers can optimise your code, which is great when you’re shipping. But for development, stepping through optimised code is painful, and without debug info our debuggers won’t know much about the program. So when we get an error, or want to step through the code to see what it’s doing, we’ll likely not get all the info we’re looking for.

To get a debugger-friendly build, we ask the compiler to generate debug info and to leave optimisations off. These compilers don’t optimise unless asked to, but being explicit about it doesn’t hurt.

To do this we need to change our build.sh script to the following:

MSVC:

mkdir -p ./build
cd ./build

# -Zi tells msvc to output debug info
cl ../src/main.c -Zi -link -out:main.exe

gcc:

mkdir -p ./build
cd ./build

# -g tells gcc to output debug info
# -O0 (capital o followed by zero) stops gcc from optimising
gcc ../src/main.c -g -O0 -o main.exe

clang:

mkdir -p ./build
cd ./build

# -g tells clang to output debug info
# -O0 (capital o followed by zero) stops clang from optimising
clang ../src/main.c -g -O0 -o main.exe

For MSVC, additional files ending in .pdb should be created in your build directory if this is successful. These contain all the debug info required for the executable. Using clang and gcc on Linux, no additional files will be generated; the debug info is stored inside the executable itself.

That’s it! You’re finally set up and ready to write code.

While this isn’t totally all there is to know here, it’s enough to output an executable that you can actually run on your computer, which is pretty cool if you ask me.

Next we’ll be going into how to actually write C / C++ code.

Next up: Variables

This post is licensed under CC BY 4.0 by the author.