From source code to executable
Now that you’ve seen how map works, let’s take a dive into how we went from high-level C code to an executable.
There are 10 written questions for this section, and you must submit your responses to these questions on Gradescope.
Before we start, we’ll be using a few compiler flags which are likely new to you. Here’s a summary of the flags we’ll be using.
- -Wall: Enables most common compiler warnings.
- -m32: Compiles the code for the i386 (32-bit x86) architecture.
- -E: Invokes the PREPROCESSOR only.
- -S: Invokes the COMPILER only (stops before assembling).
- -c: Invokes the COMPILER and ASSEMBLER only (stops before linking).
Important: Please use i386-gcc instead of gcc for this homework.
Let’s now invoke the compiler. The compiler takes high-level C code and produces assembly for the 32-bit variant of x86 known as i386.
To compile map.c, run:
i386-gcc -m32 -S -o map.S map.c
This will only invoke the compiler for map.c and output the assembly code in map.S.
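Before reading map.S, it can help to look at the assembly for something much smaller. The following is only a sketch: tiny.c and add_three are hypothetical names and are not part of the homework sources.

```c
/* tiny.c: a hypothetical file, not part of the homework sources.
 * It can be compiled to assembly the same way as map.c:
 *   i386-gcc -m32 -S -o tiny.S tiny.c
 */

/* Adds three integers; small enough that its assembly is easy to follow. */
int add_three(int a, int b, int c) {
    return a + b + c;
}
```

In the resulting tiny.S you should be able to find the function's name as a label, followed by the handful of instructions that implement it.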
- Generate recurse.S and find which instructions correspond to the recursive call of recur(i - 1).
Now we will assemble our compiled code into an object file. To assemble our code, we can run:
i386-gcc -m32 -c map.S -o map.o
This turns our raw x86 assembly (map.S) into machine code stored in an object file (map.o).
We can also combine the compile and assemble steps by running i386-gcc -m32 -c on our C file directly:
i386-gcc -m32 -c recurse.c -o recurse.o
The assembler converts the raw assembly code into an object file which contains code as well as other data and metadata necessary for execution. Different operating systems use different types of object files. In this class, we will be using ELF (Executable and Linkable Format), the object format used by Linux. Let’s start by taking a look at map.o and recurse.o. These are object files, so we will use the objdump program to read them.
i386-objdump -D map.o
i386-objdump -D recurse.o
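Every ELF file starts with the same four magic bytes: 0x7F followed by the characters 'E', 'L', 'F'. As a small illustration of what "object file format" means at the byte level, here is a sketch of a check for that signature. The function name looks_like_elf is hypothetical, and this is not how objdump itself is implemented.

```c
#include <stddef.h>

/* Returns 1 if buf starts with the ELF magic bytes 0x7F 'E' 'L' 'F', else 0.
 * len is the number of valid bytes in buf. */
int looks_like_elf(const unsigned char *buf, size_t len) {
    return len >= 4 &&
           buf[0] == 0x7F && buf[1] == 'E' && buf[2] == 'L' && buf[3] == 'F';
}
```

Reading the first four bytes of map.o into a buffer and passing them to a function like this would be one way to confirm the file is ELF before handing it to a tool.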
- What do the .text and .data sections contain? Provide a qualitative description.
The assembler generates a symbol table, which is part of the object file. The symbol table records the symbols that the object file defines or refers to, including all the symbols that can be globally referenced (referenced from another object file), such as global variables and functions.
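To make the idea of symbols concrete, here is a small sketch of a C file with several kinds of names. The identifiers visit_count, label, and record_visit are hypothetical and do not come from map.c or recurse.c.

```c
#include <string.h>

/* Defined here with external linkage: other object files may refer to it. */
int visit_count = 0;

/* Declared static, so it has internal linkage: other object files cannot
 * refer to it by name. */
static const char *label = "visits";

/* A function defined here with external linkage. */
int record_visit(void) {
    visit_count++;
    /* strlen is used here but defined in the C library, so this file can
     * only refer to it; the definition is supplied at link time. */
    return (int)strlen(label) + visit_count;
}
```

Compiling a file like this with -c and inspecting the object file is one way to see how definitions and references show up differently. Depending on the optimization level, the compiler may replace the strlen call with a constant, in which case no reference to strlen will appear.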
- What command do we use to view the symbols in an ELF file? (Hint: We can use objdump again; look at man objdump to find the right flag.)
Here’s an excerpt from the map.o symbol table:
00000000 g O .data 00000004 stuff
00000000 g F .text 00000060 main
...
00000000 *UND* 00000000 malloc
00000000 *UND* 00000000 recur
- What do the g, O, F, and *UND* flags mean?
Finally, let’s link our two object files to create an executable.
i386-gcc -m32 map.o recurse.o -o map
Note that we could’ve just called i386-gcc -m32 map.c recurse.c -o map on the C files to do this entire process in a single command. Build systems often separate these commands in order to speed up compile times (since only the changed files need to be recompiled).
- Examine the symbol table of the entire map program now. What has changed? Give a general description, including what happened to recur.
objdump can be used to look at more than just the symbol table; it can also show us the structure of the executable. Run i386-objdump -x -d map. You will see that your program has several segments, and that the names of functions and variables in your program correspond to labels with addresses or values. The guts of everything is chunks of stuff within segments.
In the objdump output, these segments are under the section heading. There’s actually a slight nuance between these two terms, which you can read more about online.
Using the output of objdump, answer the following questions:
- What segment(s)/section(s) contains recur (the function)? (The address of recur in objdump will not be exactly the same as what you saw in gdb. An optional stretch exercise is to think about why. Hint: See the Wikipedia article on relocation.)
- What segment(s)/section(s) contains global variables? Hint: look for the variables foo and stuff.
- Do you see the stack segment anywhere? What about the heap? Explain.
- Based on the output of map, in which direction does the stack grow? Explain. (Reminder: Please use i386-exec ./map to run map.)
When you ran map, you might have noticed that it prints "CS362 is the best!". However, we wanted to print "CS162 is the best!".
Let’s see what happened by invoking the preprocessing stage. The preprocessor takes your C code and outputs new C code. What does this really do? Time to find out!
To preprocess map.c, run:
i386-gcc -m32 -E -o map.i map.c
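As a general illustration of conditional compilation (not taken from map.c), here is a minimal sketch. LONG_GREETING and greeting are hypothetical names chosen for this example.

```c
/* LONG_GREETING is a hypothetical macro. It is not defined anywhere in this
 * file, so the preprocessor keeps the #else branch and discards the other. */
const char *greeting(void) {
#ifdef LONG_GREETING
    return "Hello there, and welcome!";
#else
    return "Hi!";
#endif
}
```

Running only the preprocessor on a file like this leaves a single return statement in the function body; the branch that was not selected never reaches the compiler.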
- You can see that gcc produces a map.i that is far larger than the original map.c file. How did map.c insert the printf statement for "CS362 is the best!"? Be specific, including how the #ifdef directive played a role!
- Modify Makefile to make sure that "CS162 is the best!" is printed instead. You may not modify or add any other files. Hint: Refer to this page from the GCC documentation.