From source code to executable

Now that you’ve seen how map works, let’s take a dive into how we went from high-level C code to an executable.

There are 10 written questions for this section, and you must submit your responses to these questions on Gradescope.

Before we start, we’ll be using a few compiler flags which are likely new to you. Here’s a summary of the flags we’ll be using.

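  • -Wall – Enables a broad set of commonly useful compiler warnings (despite the name, not every warning).

  • -m32 – Compiles the code for the 32-bit i386 architecture.

  • -E – Invokes the PREPROCESSOR only.

  • -S – Invokes the PREPROCESSOR and COMPILER only, stopping before the assembler; the output is assembly code.

  • -c – Invokes the PREPROCESSOR, COMPILER, and ASSEMBLER only, stopping before the linker; the output is an object file.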

Important: Please use i386-gcc instead of gcc for this homework.
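
To make the flags concrete, here is a minimal sketch using a hypothetical file, stages.c, which is not part of the homework. The comments list one command per stage and the kind of output to expect; the file name, macro, and function are assumptions made purely for illustration.

```c
/* stages.c: hypothetical file used only to try out the flags above.
 *
 *   i386-gcc -m32 -Wall -E stages.c -o stages.i   (preprocessed C)
 *   i386-gcc -m32 -Wall -S stages.c -o stages.S   (assembly text)
 *   i386-gcc -m32 -Wall -c stages.c -o stages.o   (object file)
 */
#include <assert.h>

#define SCALE 3 /* replaced by the literal 3 in the -E output */

/* Multiplies a non-negative value by SCALE. */
int scale(int x) {
    assert(x >= 0); /* assert is a macro too, so -E expands it as well */
    return x * SCALE;
}
```

Because this file has no main, it can be preprocessed, compiled, and assembled, but it cannot be linked into an executable by itself.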

Let’s now invoke the compiler. The compiler takes high-level C code and produces 32-bit x86 assembly, also known as i386 (or IA-32) assembly.

To compile map.c, run:

i386-gcc -m32 -S -o map.S map.c

This will only invoke the compiler for map.c and output the assembly code in map.S.

  1. Generate recurse.S and find which instructions correspond to the recursive call of recur(i - 1).

Now we will assemble our compiled code into an object file. To assemble our code, we can run:

i386-gcc -m32 -c map.S -o map.o

This turns our x86 assembly code (map.S) into machine code, stored in an object file (map.o).

We can also combine these two steps by running i386-gcc -m32 -c on a C file directly. For example:

i386-gcc -m32 -c recurse.c -o recurse.o

The assembler converts the raw assembly code into an object file which contains code as well as other data and metadata necessary for execution. Different operating systems use different types of object files. In this class, we will be using ELF (Executable and Linkable Format), the object format used by Linux. Let’s start by taking a look at map.o and recurse.o. These are object files, so we will use the objdump program to read them.

i386-objdump -D map.o
i386-objdump -D recurse.o

  1. What do the .text and .data sections contain? Provide a qualitative description.

The assembler generates a symbol table, which is part of the object file. The symbol table lists the symbols that the object file defines or refers to, including those that can be referenced from other object files (e.g. global variables and functions).
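
As a rough illustration of how C declarations relate to symbols, the sketch below mixes three cases: a definition visible to other files, a definition private to this file, and a reference to something defined elsewhere. The file and every name in it are hypothetical; comparing its symbol table against the comments is one way to check your understanding.

```c
/* symbols.c: hypothetical file, not part of the homework. */
#include <stdlib.h>

/* Defined here with external linkage: other object files can refer to it. */
int shared_total = 0;

/* Defined here with internal linkage: only this file can refer to it by name. */
static int call_count = 0;

/* External linkage, so another file may declare and call it. */
int add_to_total(int n) {
    call_count++;
    shared_total += n;
    return shared_total;
}

/* References malloc, which this file uses but does not define.
 * The caller owns the returned pointer and must free it. */
int *copy_call_count(void) {
    int *copy = malloc(sizeof *copy);
    if (copy != NULL) {
        *copy = call_count;
    }
    return copy;
}
```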

  1. What command do we use to view the symbols in an ELF file? (Hint: We can use objdump again, look at man objdump to find the right flag).

Here’s an excerpt from the map.o symbol table:

00000000 g O .data 00000004 stuff
00000000 g F .text 00000060 main
...
00000000 *UND* 00000000 malloc
00000000 *UND* 00000000 recur

  1. What do the g, O, F, and *UND* flags mean?

Finally, let’s link our two object files to create an executable.

i386-gcc -m32 map.o recurse.o -o map

Note that we could have run i386-gcc -m32 map.c recurse.c -o map on the C files to do this entire process in a single command. Build systems often keep these steps separate to speed up builds, since only the files that changed need to be recompiled.

  1. Examine the symbol table of the entire map program now. What has changed? Give a general description, including what happened to recur.

objdump can show us more than just the symbol table: it can also show us the structure of the executable. Run i386-objdump -x -d map. You will see that your program has several segments, and that the names of functions and variables in your program correspond to labels with addresses or values. The actual contents of the program (its code and data) are stored in chunks within these segments.

In the objdump output, these segments are listed under the section heading. There is a slight nuance between the two terms: roughly, sections are the units the linker works with, while segments are the units that get loaded into memory at run time. You can read more about the distinction online.

Using the output of objdump, answer the following questions:

  1. What segment(s)/section(s) contains recur (the function)? (The address of recur in objdump will not be exactly the same as what you saw in gdb. An optional stretch exercise is to think about why. Hint: See the Wikipedia article on relocation.)

  2. What segment(s)/section(s) contains global variables? Hint: look for the variables foo and stuff.

  3. Do you see the stack segment anywhere? What about the heap? Explain.

  4. Based on the output of map, in which direction does the stack grow? Explain. (Reminder: Please use i386-exec ./map to run map.)

When you ran map, you might have noticed that it prints "CS362 is the best!". However, we wanted to print "CS162 is the best!".

Let’s see what happened by invoking the preprocessing stage on its own. The preprocessor takes your C code and outputs new C code. What does this really do? Time to find out!
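
Before looking at map.c, here is a minimal sketch of conditional compilation. It assumes a made-up macro FORMAL and a made-up function greeting, neither of which appears in map.c. The preprocessor keeps only one branch of the #ifdef, so the compiler proper never sees the other.

```c
/* greeting.c: hypothetical file, not part of the homework. */

#define FORMAL /* remove this line and the #else branch is kept instead */

#ifdef FORMAL
#define GREETING "Good day"
#else
#define GREETING "Hi"
#endif

/* After preprocessing, the body is just: return "Good day"; */
const char *greeting(void) {
    return GREETING;
}
```

Running the -E stage on a file like this shows neither the #define lines nor the discarded branch, only the resulting C code (plus some line markers).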

To preprocess map.c, run:

i386-gcc -m32 -E -o map.i map.c

  1. You can see that gcc produces a map.i that is far larger than the original map.c file. How did map.c insert the printf statement for "CS362 is the best!"? Be specific, including how the #ifdef directive played a role!

  2. Modify Makefile to make sure that "CS162 is the best!" is printed instead. You may not modify or add any other files. Hint: Refer to this page from the GCC documentation.