# decomp-toolkit [![Build Status]][actions]

[Build Status]: https://github.com/encounter/decomp-toolkit/actions/workflows/build.yml/badge.svg
[actions]: https://github.com/encounter/decomp-toolkit/actions

Yet another GameCube/Wii decompilation toolkit.

decomp-toolkit functions both as a command-line tool for developers, and as a replacement for various parts of a
decompilation project's build system.

For use in a new decompilation project, see [dtk-template](https://github.com/encounter/dtk-template), which provides a
project structure and build system that uses decomp-toolkit under the hood.

## Sections

- [Goals](#goals)
- [Background](#background)
- [Analyzer features](#analyzer-features)
- [Other approaches](docs/other_approaches.md)
- [Terminology](docs/terminology.md)
- [Commands](#commands)
  - [ar create](#ar-create)
  - [ar extract](#ar-extract)
  - [demangle](#demangle)
  - [disc info](#disc-info)
  - [disc extract](#disc-extract)
  - [disc convert](#disc-convert)
  - [disc verify](#disc-verify)
  - [dol info](#dol-info)
  - [dol split](#dol-split)
  - [dol diff](#dol-diff)
  - [dol apply](#dol-apply)
  - [dol config](#dol-config)
  - [dwarf dump](#dwarf-dump)
  - [elf disasm](#elf-disasm)
  - [elf fixup](#elf-fixup)
  - [elf2dol](#elf2dol)
  - [map](#map)
  - [rel info](#rel-info)
  - [rel merge](#rel-merge)
  - [rso info](#rso-info)
  - [rso make](#rso-make)
  - [shasum](#shasum)
  - [nlzss decompress](#nlzss-decompress)
  - [rarc list](#rarc-list)
  - [rarc extract](#rarc-extract)
  - [u8 list](#u8-list)
  - [u8 extract](#u8-extract)
  - [vfs ls](#vfs-ls)
  - [vfs cp](#vfs-cp)
  - [yay0 decompress](#yay0-decompress)
  - [yay0 compress](#yay0-compress)
  - [yaz0 decompress](#yaz0-decompress)
  - [yaz0 compress](#yaz0-compress)

## Goals

- Automate as much as possible, allowing developers to focus on matching code rather than months-long tedious setup.
- Provide highly **accurate** and performant analysis and tooling.
- Provide everything in a single portable binary. This simplifies project setup: a script can simply fetch the
  binary from GitHub.
- Replace common usages of msys2 and GNU assembler, eliminating the need to depend on devkitPro.
- Integrate well with other decompilation tooling like [objdiff](https://github.com/encounter/objdiff) and
  [decomp.me](https://decomp.me).

## Background

The goal of a matching decompilation project is to write C/C++ code that compiles back to the _exact_ same binary as
the original game. This often requires using the same compiler as the original game (for GameCube and Wii,
[Metrowerks CodeWarrior](https://en.wikipedia.org/wiki/CodeWarrior)).

When compiling C/C++ code, the compiler (in our case, `mwcceppc`) generates an object file (`.o`) for every source file.
This object file contains the compiled machine code, as well as information that the linker (`mwldeppc`) uses to
generate the final executable.

One way to verify that our code is a match is by taking any code that has been decompiled, and
linking it alongside portions of the original binary that have not been decompiled yet. First, we create relocatable
objects from the original binary:

<picture>
<source media="(prefers-color-scheme: dark)" srcset="assets/diagram_dark.svg">
<source media="(prefers-color-scheme: light)" srcset="assets/diagram_light.svg">
<img alt="Binary split diagram" src="assets/diagram.svg">
</picture>

(Heavily simplified)

Then, each object can be replaced by a decompiled version as matching code is written. If the linker still generates a
binary that is byte-for-byte identical to the original, then we know that the decompiled code is a match.

decomp-toolkit provides tooling for analyzing and splitting the original binary into relocatable objects, as well
as generating the linker script and other files needed to link the decompiled code.

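The byte-for-byte check described above can be sketched with standard tools. This is only an illustration, not part of decomp-toolkit: the `is_match` helper and both file paths are hypothetical names chosen for the example.

```shell
# Hypothetical helper: succeeds only if both files are byte-for-byte identical.
is_match() {
    cmp -s "$1" "$2"
}

# Hypothetical paths: the original binary and the relinked output.
if is_match orig/main.dol build/main.dol; then
    echo "match"
else
    echo "mismatch"
fi
```

`cmp -s` stays silent and reports the result through its exit status, so the helper can be used directly in build scripts.
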
## Analyzer features

**Function boundary analysis**
Discovers function boundaries with high accuracy. Uses various heuristics to disambiguate tail calls from
inner-function control flow.

**Signature analysis**
Utilizes a built-in signature database to identify common Metrowerks and SDK functions and objects.
This also helps decomp-toolkit automatically generate required splits, like `__init_cpp_exceptions`.

**Relocation analysis**
Performs control-flow analysis and rebuilds relocations with high accuracy.
With some manual tweaking (mainly in data), this should generate fully-shiftable objects.

**Section analysis**
Automatically identifies DOL and REL sections based on information from signature and relocation analysis.

**Object analysis**
Attempts to identify the type and size of data objects by analyzing usage.
Also attempts to identify string literals, wide string literals, and string tables.

**Splitting**
Generates split object files in memory based on user configuration.
In order to support relinking with `mwldeppc.exe`, any **unsplit** `.ctors`, `.dtors`, `extab` and `extabindex` entries
are analyzed and automatically split along with their associated functions. This ensures that the linker will properly
generate these sections without any additional configuration.
A topological sort is performed to determine the final link order of the split objects.

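To illustrate what a topological sort does for link order, here is a small sketch using the standard `tsort` utility. The object names and ordering constraints are made up, and this is not how decomp-toolkit implements the step internally. Each input line `A B` means "A must come before B".

```shell
# Hypothetical ordering constraints: each line "A B" means A precedes B.
link_order() {
    printf '%s\n' \
        'init.o main.o' \
        'main.o audio.o' \
        'audio.o render.o' | tsort
}

# Prints one object per line in an order that satisfies every constraint.
link_order
```

Because these constraints form a single chain, only one order is valid: `init.o`, `main.o`, `audio.o`, `render.o`.
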
**Object file writing**
Writes object files directly, with no assembler required. (Bye devkitPPC!)
If desired, optionally writes GNU assembler-compatible files alongside the object files.

**Linker script generation**
Generates `ldscript.lcf` for `mwldeppc.exe`.

## Commands

### ar create

Create a static library (.a) from the input objects.

```shell
$ dtk ar create out.a input_1.o input_2.o
# or
$ echo input_1.o >> rspfile
$ echo input_2.o >> rspfile
$ dtk ar create out.a @rspfile
```

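When the object list is long, the response file can be generated in one step. A small sketch, assuming the objects live in a hypothetical `build/obj` directory; `write_rspfile` is an illustrative helper, not a dtk feature. It overwrites the response file rather than appending with `>>`, so entries from a previous run do not accumulate.

```shell
# Hypothetical helper: write one object path per line to a response file.
# $1 = directory containing .o files, $2 = response file to overwrite.
# Note: if the directory has no .o files, the unexpanded pattern is written as-is.
write_rspfile() {
    printf '%s\n' "$1"/*.o > "$2"
}

# Example usage (paths are placeholders):
#   write_rspfile build/obj rspfile
#   dtk ar create out.a @rspfile
```
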
### ar extract

Extracts the contents of static library (.a) files.

Accepts multiple files, glob patterns (e.g. `*.a`) and response files (e.g. `@rspfile`).

Options:
- `-o`, `--out <output-dir>`: Output directory. Defaults to the current directory.
- `-v`, `--verbose`: Verbose output.
- `-q`, `--quiet`: Suppresses all output except errors.

```shell
# Extracts to outdir
$ dtk ar extract lib.a -o outdir

# With multiple inputs, extracts to separate directories
# Extracts to outdir/lib1, outdir/lib2
$ dtk ar extract lib1.a lib2.a -o outdir
```

### demangle

Demangles CodeWarrior C++ symbols. A thin wrapper for [cwdemangle](https://github.com/encounter/cwdemangle).

```shell
$ dtk demangle 'BuildLight__9CGuiLightCFv'
CGuiLight::BuildLight() const
```

### disc info

_`disc` commands are wrappers around the [nod](https://github.com/encounter/nod-rs) library
and its `nodtool` command line tool._

Displays information about disc images.

To list the contents of a disc image, use [vfs ls](#vfs-ls).

Supported disc image formats:

- ISO (GCM)
- WIA / RVZ
- WBFS (+ NKit 2 lossless)
- CISO (+ NKit 2 lossless)
- NFS (Wii U VC)
- GCZ
- TGC

```shell
$ dtk disc info /path/to/game.iso
```

### disc extract

Extracts the contents of disc images to a directory.

See [disc info](#disc-info) for supported formats.

> [!NOTE]
> [vfs cp](#vfs-cp) is more flexible and supports disc images.

```shell
$ dtk disc extract /path/to/game.iso [outdir]
```

By default, only the main **data** partition is extracted.
Use the `-p`/`--partition` option to choose a different partition.
(Options: `all`, `data`, `update`, `channel`, or a partition index)

### disc convert

Converts any supported disc image to raw ISO (GCM).

If the format is lossless, the output will be identical to the original disc image.

See [disc info](#disc-info) for supported formats.

```shell
$ dtk disc convert /path/to/game.wia /path/to/game.iso
```

### disc verify

Hashes the contents of a disc image and verifies it against a built-in [Redump](http://redump.org/) database.

See [disc info](#disc-info) for supported formats.

```shell
$ dtk disc verify /path/to/game.iso
```

### dol info

Analyzes a DOL file and outputs section and symbol information.

See [vfs ls](#vfs-ls) for information on the VFS abstraction.

```shell
$ dtk dol info input.dol
# or, directly from a disc image
$ dtk dol info 'disc.rvz:sys/main.dol'
```

### dol split

> [!IMPORTANT]
> **This command is intended to be used as part of a decompilation project's build system.**
> For an example project structure and for documentation on the configuration, see
> [dtk-template](https://github.com/encounter/dtk-template).

Analyzes and splits a DOL file into relocatable objects based on user configuration.

```shell
$ dtk dol split config.yml target
```

### dol diff

Simple diff tool for issues in a linked ELF. (Yes, not DOL. It's misnamed.)
Tries to find the most obvious difference causing a mismatch.

Pass in the project configuration file, and the path to the linked ELF file to compare against.

```shell
$ dtk dol diff config.yml build/main.elf
```

### dol apply

Applies updated symbols from a linked ELF to the project configuration. (Again, misnamed.)

Useful after matching a file. It will pull updated symbol information from the final result.

```shell
$ dtk dol apply config.yml build/main.elf
```

### dol config

Generates an initial project configuration file from a DOL (& RELs).

Pass in the DOL file, and any REL files that are linked with it.
Or, for Wii games, pass in the `selfile.sel`. (Not RSOs)

```shell
$ dtk dol config main.dol rels/*.rel -o config.yml
```

### dwarf dump

Dumps DWARF 1.1 information from an ELF file. (Does **not** support DWARF 2+)

```shell
$ dtk dwarf dump input.elf
```

### elf disasm

Disassembles an unstripped CodeWarrior ELF file. Attempts to automatically split objects and rebuild relocations
when possible.

```shell
$ dtk elf disasm input.elf out
```

### elf fixup

Fixes issues with GNU assembler-built objects to ensure compatibility with `mwldeppc.exe`.

- Strips empty sections
- Generates section symbols for all allocatable sections
- Where possible, replaces section-relative relocations with direct relocations.
- Adds an ` (asm)` suffix to the file symbol. (For matching progress calculation)

```shell
# input and output can be the same
$ dtk elf fixup file.o file.o
```

### elf2dol

Creates a DOL file from the provided ELF file.

```shell
$ dtk elf2dol input.elf output.dol
# or, to ignore certain sections
$ dtk elf2dol input.elf output.dol --ignore debug_section1 --ignore debug_section2
```

### map

Processes CodeWarrior map files and provides information about symbols and TUs.

```shell
$ dtk map entries Game.MAP 'Unit.o'
# Outputs all symbols that are referenced by Unit.o
# This is useful for finding deduplicated weak functions,
# which only show on first use in the link map.

$ dtk map symbol Game.MAP 'Function__5ClassFv'
# Outputs reference information for Function__5ClassFv
# CodeWarrior link maps can get very deeply nested,
# so this is useful for emitting direct references
# in a readable format.
```

### rel info

Prints information about a REL file.

See [vfs ls](#vfs-ls) for information on the VFS abstraction.

```shell
$ dtk rel info input.rel
# or, directly from a disc image
$ dtk rel info 'disc.rvz:files/RELS.arc:amem/d_a_tag_so.rel'
```

### rel merge

Merges a DOL file and associated RELs into a single ELF file, suitable for analysis in your favorite
reverse engineering software.

```shell
$ dtk rel merge main.dol rels/*.rel -o merged.elf
```

### rso info

> [!WARNING]
> This command is not yet functional.

Prints information about an RSO file.

```shell
$ dtk rso info input.rso
```

### rso make

> [!WARNING]
> This command does not yet support creating SEL files.

Creates an RSO file from a relocatable ELF file.

Options:
- `-o`, `--output <File>`: Output RSO file.
- `-m`, `--module-name <Name>`: Module name (or path). Default: input name
- `-e`, `--export <File>`: File containing exported symbol names. (Newline separated)

```shell
$ dtk rso make input.elf -o input.rso
```

### shasum

Calculate and verify SHA-1 hashes.

```shell
$ dtk shasum baserom.dol
949c5ed7368aef547e0b0db1c3678f466e2afbff baserom.dol

$ dtk shasum -c baserom.sha1
baserom.dol: OK
```

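The file passed to `-c` lists the expected hashes. The sketch below creates and checks such a file with coreutils `sha1sum`. It assumes `dtk shasum -c` reads the same `<hash>  <path>` layout, which the output above suggests but this document does not spell out. The `demo.bin` file name is a placeholder.

```shell
# Work in a scratch directory so nothing in the project is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Placeholder input file standing in for baserom.dol.
printf 'example data\n' > demo.bin

# Record the expected hash, one "<hash>  <path>" entry per line.
sha1sum demo.bin > demo.sha1

# Verify the file against the recorded hash.
sha1sum -c demo.sha1
```
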
### nlzss decompress

Decompresses NLZSS-compressed files.

```shell
$ dtk nlzss decompress input.bin.lz -o output.bin
# or, for batch processing
$ dtk nlzss decompress rels/*.lz -o rels
```

### rarc list

> [!NOTE]
> [vfs ls](#vfs-ls) is more flexible and supports RARC archives.
> This command is now equivalent to `dtk vfs ls -r input.arc:`

Lists the contents of an RARC (older .arc) archive.

```shell
$ dtk rarc list input.arc
```

### rarc extract

> [!NOTE]
> [vfs cp](#vfs-cp) is more flexible and supports RARC archives.
> This command is now equivalent to `dtk vfs cp input.arc: output_dir`

Extracts the contents of an RARC (older .arc) archive.

```shell
$ dtk rarc extract input.arc -o output_dir
```

### u8 list

> [!NOTE]
> [vfs ls](#vfs-ls) is more flexible and supports U8 archives.
> This command is now equivalent to `dtk vfs ls -r input.arc:`

Lists the contents of a U8 (newer .arc) archive.

```shell
$ dtk u8 list input.arc
```

### u8 extract

> [!NOTE]
> [vfs cp](#vfs-cp) is more flexible and supports U8 archives.
> This command is now equivalent to `dtk vfs cp input.arc: output_dir`

Extracts the contents of a U8 (newer .arc) archive.

```shell
$ dtk u8 extract input.arc -o output_dir
```

### vfs ls

decomp-toolkit has a powerful virtual filesystem (VFS) abstraction that allows you to work with a
variety of containers. All operations happen in memory with minimal overhead and no temporary files.

Supported containers:

- Disc images (see [disc info](#disc-info) for supported formats)
- RARC archives (older .arc)
- U8 archives (newer .arc)

Supported compression formats are handled transparently:
- Yay0 (SZP) / Yaz0 (SZS)
- NLZSS (.lz) (Use `:nlzss` in the path)

`vfs ls` lists the contents of a container or directory.

Options:

- `-r`, `--recursive`: Recursively list contents.
- `-s`, `--short`: Only list file names.

Examples:

```shell
# List the contents of the `amem` directory inside `RELS.arc` in a disc image
$ dtk vfs ls 'disc.rvz:files/RELS.arc:amem'
# List the contents of `RELS.arc` recursively
$ dtk vfs ls -r 'disc.rvz:files/RELS.arc:'

# All commands that accept a file path can also accept a VFS path
$ dtk rel info 'disc.rvz:files/RELS.arc:amem/d_a_tag_so.rel'
# Example disc image within a disc image
$ dtk dol info 'disc.rvz:files/zz_demo.tgc:sys/main.dol'
```

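As the examples show, a VFS path is a chain of segments separated by `:`, where each segment is looked up inside the container named by the previous one. The sketch below only splits a path string to make that nesting visible. `vfs_segments` is a hypothetical helper, not a dtk command, and it does not reflect how decomp-toolkit parses paths internally.

```shell
# Hypothetical helper: print each ':'-separated segment of a VFS path on its own line.
vfs_segments() {
    printf '%s\n' "$1" | tr ':' '\n'
}

vfs_segments 'disc.rvz:files/RELS.arc:amem/d_a_tag_so.rel'
# disc.rvz              <- disc image on the host filesystem
# files/RELS.arc        <- archive inside the disc image
# amem/d_a_tag_so.rel   <- file inside the archive
```
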
### vfs cp

See [vfs ls](#vfs-ls) for information on the VFS abstraction.

`vfs cp` copies files and directories recursively to the host filesystem.

Options:

- `--no-decompress`: Do not decompress files when copying.
- `-q`, `--quiet`: Suppresses all output except errors.

Examples:

```shell
# Extract a file from a nested path in a disc image to the current directory
$ dtk vfs cp 'disc.rvz:files/RELS.arc:amem/d_a_tag_so.rel' .

# Directories are copied recursively, making it easy to extract entire archives
$ dtk vfs cp 'disc.rvz:files/RELS.arc:' rels
# Or, to disable automatic decompression
$ dtk vfs cp --no-decompress 'disc.rvz:files/RELS.arc:' rels
```

### yay0 decompress

Decompresses Yay0-compressed files.

```shell
$ dtk yay0 decompress input.bin.yay0 -o output.bin
# or, for batch processing
$ dtk yay0 decompress rels/*.yay0 -o rels
```

### yay0 compress

Compresses files using Yay0 compression.

```shell
$ dtk yay0 compress input.bin -o output.bin.yay0
# or, for batch processing
$ dtk yay0 compress rels/* -o rels
```

### yaz0 decompress

Decompresses Yaz0-compressed files.

```shell
$ dtk yaz0 decompress input.bin.yaz0 -o output.bin
# or, for batch processing
$ dtk yaz0 decompress rels/*.yaz0 -o rels
```

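Yay0 and Yaz0 files start with a four-byte ASCII magic (`Yay0` or `Yaz0`), so a script can check which decompressor applies before calling `dtk`. A minimal sketch: `compression_kind` is a hypothetical helper, and the file names in the usage comment are placeholders.

```shell
# Hypothetical helper: report the compression format from the first 4 bytes of a file.
compression_kind() {
    # Drop NUL bytes so the shell does not warn when capturing binary data.
    magic=$(head -c 4 "$1" | tr -d '\0')
    case "$magic" in
        Yay0) echo "yay0" ;;
        Yaz0) echo "yaz0" ;;
        *)    echo "unknown" ;;
    esac
}

# Example usage (placeholder file names):
#   [ "$(compression_kind input.bin)" = "yaz0" ] && dtk yaz0 decompress input.bin -o output.bin
```
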
### yaz0 compress

Compresses files using Yaz0 compression.

```shell
$ dtk yaz0 compress input.bin -o output.bin.yaz0
# or, for batch processing
$ dtk yaz0 compress rels/* -o rels
```