From source code to executable
Now that you’ve seen how map works, let’s take a dive into how we went from high-level C code to an executable.
There are 10 written questions for this section, and you must submit your responses to these questions on Gradescope.
Before we start, we’ll be using a few compiler flags which are likely new to you. Here’s a summary of the flags we’ll be using.
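-Wall – Enables a broad set of commonly useful compiler warnings (despite the name, not every warning).
-m32 – Compiles the code for the i386 architecture. Please leave this flag out on instructional machines.
-S – Invokes the COMPILER only, stopping before the assembler. The output is assembly code.
-c – Invokes the COMPILER and ASSEMBLER only, stopping before the linker. The output is an object file.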
Let’s start by invoking the compiler. The compiler takes high-level C code and produces assembly code. With -m32, the output is 32-bit x86 assembly, also known as i386 or IA-32 assembly.
gcc -m32 -S -o map.S map.c
This invokes only the compiler on map.c and writes the assembly code to map.S. Generate recurse.S from recurse.c in the same way, and find which instruction(s) correspond to the recursive call of recur(i - 1).
Now we will assemble our compiled code into an object file. To assemble our code we can run:
gcc -m32 -c map.S -o map.o
This turns our raw x86 code (map.S) into machine code, stored in an object file (map.o).
We can also combine these steps by running gcc -m32 -c on our C file directly. For example:
gcc -m32 -c recurse.c -o recurse.o
The assembler converts the raw assembly code into an object file which contains code as well as other data and metadata necessary for execution. Different operating systems use different types of object files. In this class, we will be using ELF (Executable and Linkable Format), the object format used by Linux. Let’s start by taking a look at
map.o and recurse.o. These are object files, so we will use the objdump program to read them.
objdump -D map.o
objdump -D recurse.o
- What do the
The assembler generates a symbol table which is part of the object file. The symbol table contains all the symbols that can be globally referenced (referenced outside the object file) from another object file (i.e. global/static variables and functions).
- What command do we use to view the symbols in an ELF file? (Hint: We can use objdump again; look at man objdump to find the right flag.)
Here’s an excerpt from the map.o symbol table:
00000000 g     O .data  00000004 stuff
00000000 g     F .text  00000060 main
...
00000000         *UND*  00000000 malloc
00000000         *UND*  00000000 recur
What do the
Where else can we find a symbol for
recur? Which file is this in? Copy and paste the relevant portion of the symbol table.
Finally, let’s link our 2 object files to create an executable.
gcc -m32 map.o recurse.o -o map
Note that we could’ve just run gcc -m32 map.c recurse.c -o map on the C files to do this entire process in a single command. Build systems often keep these steps separate to speed up builds, since only the files that changed need to be recompiled.
- Examine the symbol table of the entire map program now. What has changed?
objdump can be used to look at more than just the symbol table—it can show us the structure of the executable. Run
objdump -x -d map. You will see that your program has several segments, and that the names of functions and variables in your program correspond to labels with addresses or values. The guts of everything is chunks of stuff within segments.
In the objdump output, these segments appear under the section heading. There’s actually a slight nuance between these two terms, which you can read more about online.
Using the output of
objdump, answer the following questions:
What segment(s)/section(s) contains recur (the function)? (The address of recur in objdump will not be exactly the same as what you saw in gdb. An optional stretch exercise is to think about why. Hint: See the Wikipedia article on relocation.)
What segment(s)/section(s) contains global variables? Hint: look for the variables
Do you see the stack segment anywhere? What about the heap? Explain.
Based on the output of
map, in which direction does the stack grow? Explain.