Terminology

DOL

A DOL file is the executable format used by GameCube and Wii games. It's essentially a raw binary with a header that contains information about the code and data sections, as well as the entry point.
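To make the header concrete, here is a minimal sketch of reading it in Python. It assumes the commonly documented layout: a 0x100-byte big-endian header holding file offsets, load addresses, and sizes for 7 text and 11 data sections, followed by the BSS address, BSS size, and entry point. The function name `parse_dol_header` and the dictionary keys are hypothetical, chosen only for illustration.

```python
import struct

TEXT_SECTIONS = 7    # assumed number of code section slots
DATA_SECTIONS = 11   # assumed number of data section slots
NUM_SECTIONS = TEXT_SECTIONS + DATA_SECTIONS
HEADER_SIZE = 0x100  # assumed total header size, including padding


def parse_dol_header(data: bytes) -> dict:
    """Parse a DOL header into a list of used sections plus the entry point."""
    if len(data) < HEADER_SIZE:
        raise ValueError("a DOL header is expected to be 0x100 bytes")

    # 18 offsets, 18 addresses, 18 sizes, then bss address, bss size, entry.
    words = struct.unpack_from(">57I", data, 0)
    offsets = words[0:18]
    addresses = words[18:36]
    sizes = words[36:54]
    bss_address, bss_size, entry_point = words[54:57]

    sections = []
    for i in range(NUM_SECTIONS):
        if sizes[i] == 0:
            continue  # unused slot
        sections.append({
            "kind": "text" if i < TEXT_SECTIONS else "data",
            "offset": offsets[i],
            "address": addresses[i],
            "size": sizes[i],
        })

    return {
        "sections": sections,
        "bss_address": bss_address,
        "bss_size": bss_size,
        "entry_point": entry_point,
    }
```

Note that the header carries no section names, symbols, or relocations, which is why those have to be recovered by analysis.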

ELF

An ELF file is the executable format used by most Unix-like operating systems. There are two common types of ELF files: relocatable and executable.

A relocatable ELF (.o, also called "object file") contains machine code and relocation information, and is used as input to the linker. Each object file is compiled from a single source file (.c, .cpp).

An executable ELF (.elf) contains the final machine code that can be loaded and executed. It can include information about symbols, debug information (DWARF), and sometimes information about the original relocations, but it is often missing some or all of these (referred to as "stripped").
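The two types can be told apart by the `e_type` field of the ELF header. The sketch below reads that field directly with the standard library; the function name `elf_kind` is hypothetical, and the constants `ET_REL = 1` and `ET_EXEC = 2` come from the ELF specification.

```python
import struct

ET_REL = 1   # relocatable object file (.o)
ET_EXEC = 2  # executable file (.elf)


def elf_kind(data: bytes) -> str:
    """Return 'relocatable', 'executable', or 'other' for an ELF image."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")

    # e_ident[5] (EI_DATA): 1 = little-endian, 2 = big-endian.
    # GameCube/Wii objects are big-endian PowerPC.
    endian = "<" if data[5] == 1 else ">"

    # e_type is a 16-bit field immediately after the 16-byte e_ident array.
    (e_type,) = struct.unpack_from(endian + "H", data, 16)
    return {ET_REL: "relocatable", ET_EXEC: "executable"}.get(e_type, "other")
```

For example, `elf_kind(open("main.o", "rb").read())` would be expected to return `"relocatable"` for a compiled object file (`main.o` is a placeholder path).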

Symbol

A symbol is a name that is assigned to a memory address. Symbols can be functions, variables, or other data.

Local symbols are only visible within the object file they are defined in.
These are usually defined as static in C/C++ or are compiler-generated.

Global symbols are visible to all object files, and their names must be unique.

Weak symbols are similar to global symbols, but can be replaced by a global symbol with the same name.
For example, the SDK defines a weak OSReport function, which a game can replace with its own global implementation.
Weak symbols are also used for functions generated by the compiler or as a result of C++ features, since they can exist in multiple object files. The linker will deduplicate these functions, keeping only the first copy.
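The binding rules above can be modeled in a few lines. This is a simplified sketch of how a linker might resolve names across object files, not the behavior of any particular linker; `resolve_symbols` and the object and symbol names in the usage note (other than OSReport) are hypothetical.

```python
def resolve_symbols(objects):
    """Resolve non-local symbols across object files, in link order.

    `objects` is a list of (object_name, [(symbol_name, binding), ...]),
    where binding is "local", "global", or "weak".
    Returns a dict mapping symbol_name -> (binding, defining_object).
    """
    table = {}
    for obj_name, symbols in objects:
        for name, binding in symbols:
            if binding == "local":
                continue  # local symbols never leave their object file

            existing = table.get(name)
            if existing is None:
                table[name] = (binding, obj_name)
            elif binding == "global" and existing[0] == "global":
                raise ValueError(f"duplicate global symbol: {name}")
            elif binding == "global" and existing[0] == "weak":
                table[name] = (binding, obj_name)  # global replaces weak
            # Otherwise the new symbol is weak and a definition already
            # exists, so the earlier definition is kept.
    return table
```

With inputs such as `[("os.o", [("OSReport", "weak")]), ("game.o", [("OSReport", "global")])]`, the global definition from `game.o` wins. Two objects that both define the same weak symbol resolve to the first one in link order.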

Relocation

A relocation is essentially a pointer to a symbol. At compile time, the final address of a symbol is not yet known, so a relocation is recorded wherever the machine code refers to that symbol. At link time, each symbol is assigned a final address, and the linker uses the relocations to patch the machine code with those addresses.

Before:

# Unrelocated, instructions point to address 0 (unknown)
lis r3, 0
ori r3, r3, 0

After:

# Relocated, instructions point to 0x80001234
lis r3, 0x8000
ori r3, r3, 0x1234

Once the linker has applied a relocation with the final address, the relocation is no longer needed. The final ELF sometimes retains the relocation information anyway, but the conversion to DOL always removes it.
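The before/after listing can be reproduced at the byte level. The sketch below patches the 16-bit immediate fields of a `lis`/`ori` pair, which corresponds roughly to what the PowerPC ELF ABI calls `R_PPC_ADDR16_HI` and `R_PPC_ADDR16_LO` relocations. The function name `apply_addr16_pair` is hypothetical, and a real linker handles many more relocation types than this.

```python
import struct


def apply_addr16_pair(code: bytes, offset: int, address: int) -> bytes:
    """Patch a lis/ori instruction pair at `offset` so it loads `address`."""
    hi_insn, lo_insn = struct.unpack_from(">II", code, offset)

    # The low 16 bits of each instruction hold the immediate operand.
    hi_insn = (hi_insn & 0xFFFF0000) | ((address >> 16) & 0xFFFF)
    lo_insn = (lo_insn & 0xFFFF0000) | (address & 0xFFFF)

    patched = bytearray(code)
    struct.pack_into(">II", patched, offset, hi_insn, lo_insn)
    return bytes(patched)


# lis r3, 0 ; ori r3, r3, 0
unrelocated = bytes.fromhex("3C600000" "60630000")
relocated = apply_addr16_pair(unrelocated, 0, 0x80001234)
print(relocated.hex())  # prints 3c60800060631234
```

The patched words decode to `lis r3, 0x8000` and `ori r3, r3, 0x1234`, matching the relocated listing above.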

When we analyze a file, we attempt to rebuild the relocations. This is useful for several reasons:

  • It allows us to split the file into relocatable objects. Each object can then be replaced with a decompiled version, as matching code is written.
  • It allows us to modify or add code and data to the game and have all machine code still point to the correct symbols, which may now be in a different location.
  • It allows us to view the machine code in a disassembler and show symbol names instead of raw addresses.
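As a rough illustration of what rebuilding a relocation involves, the sketch below scans machine code for an adjacent `lis`/`ori` pair on the same register, reconstructs the address it loads, and looks that address up in a symbol table. This is a deliberately naive assumption-laden example, not a description of how decomp-toolkit performs its analysis: real code may separate the two instructions, use `addi` or load/store offsets instead of `ori`, or load constants that are not addresses at all. The names `find_lis_ori_targets` and `some_symbol` are hypothetical.

```python
import struct


def find_lis_ori_targets(code: bytes, base_address: int, symbols: dict) -> list:
    """Find adjacent lis/ori pairs and the addresses they load.

    Returns a list of (instruction_address, target_address, symbol_or_None).
    `symbols` maps addresses to names.
    """
    count = len(code) // 4
    words = struct.unpack(f">{count}I", code[: count * 4])

    results = []
    for i in range(count - 1):
        hi, lo = words[i], words[i + 1]

        # lis rD, imm is addis rD, 0, imm: primary opcode 15 with rA == 0.
        is_lis = (hi >> 26) == 15 and ((hi >> 16) & 0x1F) == 0
        # ori rA, rS, imm: primary opcode 24.
        is_ori = (lo >> 26) == 24
        if not (is_lis and is_ori):
            continue

        lis_reg = (hi >> 21) & 0x1F
        ori_src = (lo >> 21) & 0x1F
        ori_dst = (lo >> 16) & 0x1F
        if not (lis_reg == ori_src == ori_dst):
            continue  # the pair does not build a value in one register

        target = ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)
        results.append((base_address + i * 4, target, symbols.get(target)))
    return results
```

Given the relocated bytes `3C608000 60631234` and a symbol table containing `{0x80001234: "some_symbol"}`, the function reports that the pair refers to `some_symbol`. That is the information needed to turn the raw address back into a relocation.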