Chapter 2. Basic Principles of the Tools
2.6.1. Object Files
The assembler creates object files that, by convention, have the .o extension. These are binary files
that contain the assembled source code, information to help the linker integrate the object file into an
executable program, debugging information and tables of all of the symbols used in the source code.
Special programs exist to manipulate object files. For example, objdump can disassemble an object
file back into assembler source code and ar can group together multiple object files into an archive or
library file.
2.6.2. Assembler Directives
Assembler directives are commands inside the assembler source files that control how the object file
is generated. They are also known as pseudo-ops, or pseudo-operations, because they can look like
commands in the assembler programming language.
Assembler directives always start with a period (.). The rest of their name is letters, usually in lower
case. They have a wide range of different uses, such as specifying alignments, inserting constants into
the output, and selecting in which sections of the output file the assembled machine instructions are
placed.
2.7. ld, the GNU Linker
The GNU linker, ld, combines multiple object files together and creates an executable program from
them. It does this by resolving references between the different object files, grouping together similar
sections in the object files into one place, arranging for these sections to be loaded at the correct
addresses in memory, and generating the necessary header information at the start of the file that allows
it to be run.
The linker moves blocks of bytes of your program to their load-time addresses. These blocks slide
to their addresses as rigid units; their length does not change and neither does the order of the bytes
within them. Such a rigid unit is called a section. Assigning runtime addresses to sections is called
relocation. It includes the task of adjusting mentions of object-file addresses so they refer to the proper
runtime addresses.
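The idea of relocation described above can be illustrated with a toy model. This is only a sketch and not how ld is implemented: the Section class, the relocate function, the 4-byte little-endian address fields and the base address 0x1000 are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    """Hypothetical stand-in for a section: a rigid block of bytes."""
    name: str
    data: bytearray
    # Each entry is (offset in this section, target section name, offset in target)
    # and marks a 4-byte address field that must be patched.
    relocs: list = field(default_factory=list)


def relocate(sections, base):
    """Assign consecutive addresses starting at base, then patch references."""
    addresses = {}
    cursor = base
    for section in sections:
        addresses[section.name] = cursor   # the block slides here as a unit
        cursor += len(section.data)        # its length never changes
    for section in sections:
        for offset, target, target_offset in section.relocs:
            value = addresses[target] + target_offset
            # Overwrite 4 bytes with 4 bytes, so the section keeps its size.
            section.data[offset:offset + 4] = value.to_bytes(4, "little")
    return addresses


# An 8-byte "text" section whose bytes 4..7 hold the address of byte 2 of "data".
text = Section("text", bytearray(8), relocs=[(4, "data", 2)])
data = Section("data", bytearray(b"\x01\x02\x03\x04"))
addresses = relocate([text, data], base=0x1000)
```

In this sketch the sections end up at 0x1000 and 0x1008, and the address field in text is patched to 0x100A. The section lengths and the order of the other bytes are untouched, which is the sense in which sections move as rigid units.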
Each section in an object file has a name and a size. Most sections also have an associated block
of data, known as the section contents. A section may be marked as allocatable, meaning that space
should be reserved for it in memory when the executable starts running. A section may also be marked
as loadable, meaning that its contents should be loaded into memory when the executable starts. A
section which is allocatable but not loadable will have a zero-filled area of memory created for it.
A section that is neither loadable nor allocatable typically contains some sort of debugging information.
Every loadable or allocatable output section has two addresses associated with it. The first is
the virtual memory address (VMA), the address the section will have when the executable is running.
The second is the load memory address (LMA), which is the address in memory where the section will
be loaded. In most cases the two addresses will be the same. An example of when they might be different
is when a data section is loaded into ROM, and then copied into RAM when the program starts. This
technique is often used to initialize global variables in a ROM-based system. In this case, the ROM
address would be the LMA, and the RAM address would be the VMA.
To review the sections in an object file, use the objdump binary utility with the -h option.
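The ROM-to-RAM case can be pictured with a small simulation. It is a sketch under invented assumptions: the memory map, the addresses, the sizes and the startup function are hypothetical, and a real system would do this copy in its start-up code, not in Python.

```python
# Toy memory map; every address and size here is made up for illustration.
ROM_BASE, RAM_BASE = 0x0000, 0x8000
memory = bytearray(0x10000)

data_init = bytes([10, 20, 30, 40])   # initial values of the global variables
data_lma = ROM_BASE + 0x100           # LMA: where the contents are loaded (ROM)
data_vma = RAM_BASE                   # VMA: where the running program expects them (RAM)

# A section that is allocatable but not loadable: it only needs zero-filled space.
bss_vma = RAM_BASE + len(data_init)
bss_size = 8

# Loading the program places the data section's contents at its LMA.
memory[data_lma:data_lma + len(data_init)] = data_init
# Pretend RAM holds arbitrary values at power-on.
memory[RAM_BASE:RAM_BASE + 16] = b"\xff" * 16


def startup():
    """Hypothetical start-up step: copy data from LMA to VMA, zero the rest."""
    size = len(data_init)
    memory[data_vma:data_vma + size] = memory[data_lma:data_lma + size]
    memory[bss_vma:bss_vma + bss_size] = bytes(bss_size)


startup()
```

After startup() runs, the initial values sit at the VMA in RAM while the original copy remains at the LMA in ROM, and the allocatable-only region is filled with zeros.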
Every object file also has a list of symbols, known as the symbol table. A symbol may be defined or
undefined. Each symbol has a name, and each defined symbol has an address. If you compile a C or
C++ program into an object file, you get a defined symbol for every defined function and global or
static variable. Every undefined function or global variable that is referenced in the input file will
become an undefined symbol. You can list the symbols in an object file by using the nm binary
utility, or by using the objdump binary utility with the -t option.
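How defined and undefined symbols from several object files fit together can be sketched as follows. The file names, symbol names, addresses and the resolve function are hypothetical, and the model ignores details such as weak symbols and libraries.

```python
def resolve(objects):
    """Match undefined symbols against definitions from all object files.

    objects maps a file name to a pair (defined, undefined), where defined
    maps symbol names to addresses and undefined is a set of symbol names.
    """
    table = {}
    for file_name, (defined, _) in objects.items():
        for name, address in defined.items():
            if name in table:
                raise ValueError(f"multiple definition of {name}")
            table[name] = (file_name, address)

    unresolved = set()
    for _, undefined in objects.values():
        unresolved |= {name for name in undefined if name not in table}
    return table, unresolved


# Hypothetical symbol tables for two object files.
objects = {
    "main.o": ({"main": 0x0}, {"helper", "printf"}),
    "helper.o": ({"helper": 0x0, "counter": 0x10}, set()),
}
table, unresolved = resolve(objects)
```

Here helper is undefined in main.o but defined in helper.o, so the reference can be resolved. printf stays unresolved in this toy model because no input file defines it; in a real link it would normally come from a library.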