From source code to executable

Now that you’ve seen how map works, let’s dive into how we get from high-level C code to an executable.

There are 10 written questions for this section, and you must submit your responses to these questions on Gradescope.

Before we start, we’ll be using a few compiler flags which are likely new to you. Here’s a summary of the flags we’ll be using.

  • -Wall – Enables all compiler warnings

  • -m32 – Compiles the code for the i386 architecture. Please leave this flag out on instructional machines.

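  • -S – Invokes the COMPILER only, stopping before the assembler. The output is assembly code.

  • -c – Invokes the COMPILER and ASSEMBLER only, stopping before the linker. The output is an object file.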

Let’s start by invoking the compiler. The compiler takes high-level C code and produces assembly code for the 32-bit x86 architecture, also known as i386 or IA-32.

Important: Please use i386-gcc instead of gcc for this homework.

To compile map.c, run:

i386-gcc -m32 -S -o map.S map.c

This will only invoke the compiler for map.c and output the assembly code in map.S.

  1. Generate recurse.S and find which instruction(s) correspond to the recursive call of recur(i - 1).

Now we will assemble our compiled code into an object file. To assemble our code we can run:

i386-gcc -m32 -c map.S -o map.o

This turns our x86 assembly code (map.S) into machine code, stored in an object file (map.o).

We can also combine these two steps by running i386-gcc -m32 -c on a C file directly:

i386-gcc -m32 -c recurse.c -o recurse.o

The assembler converts the raw assembly code into an object file which contains code as well as other data and metadata necessary for execution. Different operating systems use different types of object files. In this class, we will be using ELF (Executable and Linkable Format), the object format used by Linux. Let’s start by taking a look at map.o and recurse.o. These are object files, so we will use the objdump program to read them.

i386-objdump -D map.o
i386-objdump -D recurse.o

  2. What do the .text and .data sections contain?

The assembler generates a symbol table, which is stored in the object file. The symbol table lists the symbols the file defines, such as functions and global or static variables, along with the symbols it references but does not define. Symbols with global binding can be referenced from other object files.

  3. What command do we use to view the symbols in an ELF file? (Hint: We can use objdump again; look at man objdump to find the right flag.)

Here’s an excerpt from the map.o symbol table:

00000000 g O .data 00000004 stuff
00000000 g F .text 00000060 main
...
00000000 *UND* 00000000 malloc
00000000 *UND* 00000000 recur

  4. What do the g, O, F, and *UND* flags mean?

  5. Where else can we find a symbol for recur? Which file is this in? Copy and paste the relevant portion of the symbol table.

Finally, let’s link our two object files to create an executable.

i386-gcc -m32 map.o recurse.o -o map

Note that we could’ve just called i386-gcc -m32 map.c recurse.c -o map on the C files to do this entire process in a single command. Build systems often run these steps separately to speed up builds, since only the files that changed need to be recompiled before relinking.

  6. Examine the symbol table of the entire map program now. What has changed?

objdump can show more than just the symbol table; it can also show the structure of the executable. Run i386-objdump -x -d map. You will see that your program is divided into several segments, and that the names of functions and variables in your program correspond to labels with addresses or values. The actual code and data are the bytes stored within those segments.

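In the objdump output, these segments are listed under the Sections heading. The two terms are not quite interchangeable: sections are the units the linker works with, while segments, which are described by the program headers, are what the loader maps into memory. You can read more about the distinction online.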

Using the output of objdump, answer the following questions:

  7. What segment(s)/section(s) contains recur (the function)? (The address of recur in objdump will not be exactly the same as what you saw in gdb. An optional stretch exercise is to think about why. Hint: See the Wikipedia article on relocation.)

  8. What segment(s)/section(s) contains global variables? Hint: look for the variables foo and stuff.

  9. Do you see the stack segment anywhere? What about the heap? Explain.

  10. Based on the output of map, in which direction does the stack grow? Explain. (Reminder: Please use i386-exec ./map to run map.)