From Source to Binary: How GNU Toolchain Works
Luse Cheng, Deputy Manager, Andes Technology
Jim Huang (黃敬群) <jserv@0xlab.org>, Developer & Co-founder, 0xlab
March 31, 2011 / National Taipei University of Technology (臺北科技大學)
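Binutils – GNU Linker
The linker's job (for an ordinary statically linked executable):
- Combine all object files into a single executable
- Search high and low for every symbol (symbol resolution)
- Handle everything by the book (process each relocation type)
[Figure: the linker merges the TEXT and DATA sections of several object files into the TEXT and DATA of one executable.]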
Dynamically Linked Shared Libraries
- m.c and a.c each pass through the translators (cc1, as) to produce the relocatable object files m.o and a.o.
- The linker (ld) combines m.o and a.o with the shared library libc.so (dynamically relocatable object files) into a partially linked executable program (on disk).
- At load time the loader / dynamic linker (ld-linux.so) combines that program with libc.so to produce the fully linked executable (in memory).
- libc.so functions called by m.c and a.c, such as printf(), are loaded, linked, and (potentially) shared among processes.
Example from the slide: "$ ldd hello" lists libc.so.6 and /lib/ld-linux.so.2 (0x00524000).
Relocatable Object Files -> Executable Object File

Source files:
  m.c:  int e=7;
        int main() { int r = a(); exit(0); }
  a.c:  extern int e;
        int *ep=&e, x=15, y;
        int a() { return *ep+x+y; }

Relocatable object files:
  system   system code                  .text
           system data                  .data
  m.o      main()                       .text
           int e = 7                    .data
  a.o      a()                          .text
           int *ep = &e, int x = 15     .data
           int y                        .bss

Executable object file (starting at offset 0):
  headers
  .text:   system code, main(), a(), more system code
  .data:   system data, int e = 7, int *ep = &e, int x = 15
  .bss:    uninitialized data
  .symtab
  .debug
- Every symbol is assigned a specific value, generally a memory address.
- Code consists of symbol definitions and symbol references.
- References are either local or external.

m.c:
  int e=7;                  definition of local symbol e
  int main() {
    int r = a();            reference to external symbol a
    exit(0);                reference to external symbol exit
  }

a.c:
  extern int e;             reference to external symbol e
  int *ep=&e;               definition of local symbol ep
  int x=15;                 definitions of local symbols x, y
  int y;
  int a() {                 definition of local symbol a
    return *ep+x+y;         references to local symbols x, y
  }
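The definition/reference split above is what symbol resolution works on: each external reference has to be matched with a definition in some other module. The following is a minimal sketch of that lookup, not how ld is implemented. The struct sym_def type, the example_defs table, and resolve_symbol() are hypothetical names introduced only for illustration; the table contents are taken from the m.c / a.c example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One symbol definition: which module provides which name.
 * (Hypothetical type, for illustration only.) */
struct sym_def {
    const char *name;
    const char *module;
};

/* Definitions taken from the m.c / a.c example on the slide. */
const struct sym_def example_defs[] = {
    { "e",    "m.o" },
    { "main", "m.o" },
    { "ep",   "a.o" },
    { "x",    "a.o" },
    { "y",    "a.o" },
    { "a",    "a.o" },
};
const size_t example_def_count = sizeof example_defs / sizeof example_defs[0];

/* Return the module that defines `name`, or NULL when no module in the
 * table defines it (an unresolved reference). */
const char *resolve_symbol(const struct sym_def *defs, size_t n,
                           const char *name)
{
    assert(defs != NULL && name != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(defs[i].name, name) == 0)
            return defs[i].module;
    }
    return NULL;
}
```

With this table, the reference to a in m.c resolves to a.o and the reference to e in a.c resolves to m.o, while exit stays unresolved (NULL) until a library that defines it is added to the link.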
GCC Linker - ld

m.c:
  int e=7;
  int main() {
    int r = a();
    exit(0);
  }

Disassembly of section .text:
00000000 <main>:
   0: 55               pushl %ebp
   1: 89 e5            movl  %esp,%ebp
   3: e8 fc ff ff ff   call  4 <main+0x4>
         4: R_386_PC32 a            (relocation info)
   8: 6a 00            pushl $0x0
   a: e8 fc ff ff ff   call  b <main+0xb>
         b: R_386_PC32 exit         (relocation info)
   f: 90               nop

Disassembly of section .data:
00000000 <e>:
   0: 07 00 00 00
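The two R_386_PC32 entries mark the four bytes after each call opcode as placeholders. For this relocation type the linker stores S + A - P, where S is the final address of the symbol, A is the addend already sitting in the bytes (fc ff ff ff, that is -4), and P is the final address of the patched location. The sketch below applies that formula to the .text bytes shown above. The function names and the load addresses used in the usage note are assumptions made for illustration; they do not come from the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian value (i386 byte order). */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Write a 32-bit little-endian value. */
void write_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Apply one R_386_PC32 relocation in place and return the stored value.
 *   section      : bytes of the section being patched
 *   section_addr : final address chosen for the start of the section
 *   offset       : relocation offset inside the section
 *   sym_addr     : final address of the referenced symbol (S) */
uint32_t apply_r_386_pc32(uint8_t *section, size_t section_len,
                          uint32_t section_addr, uint32_t offset,
                          uint32_t sym_addr)
{
    assert((size_t)offset + 4 <= section_len);
    uint32_t addend = read_le32(section + offset);  /* A, here 0xfffffffc */
    uint32_t place  = section_addr + offset;        /* P */
    uint32_t value  = sym_addr + addend - place;    /* S + A - P, mod 2^32 */
    write_le32(section + offset, value);
    return value;
}
```

If main is assumed to land at 0x08048530 and a at 0x08048540, patching offset 4 stores 8: the call instruction ends at 0x08048538, and 0x08048538 + 8 is the address of a. The -4 addend is there because the CPU adds the displacement to the address of the next instruction, which is four bytes past the patched location.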
What the linker has to do
What the linker knows:
- The length of each .text and .data section
- The order of the .text and .data sections
What the linker computes:
- The absolute address of each label to be jumped to (internal or external) and of each piece of data being referenced
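Knowing only the section lengths and their order is enough to assign addresses: walk the sections in order, place each one at the next suitably aligned address, and add a symbol's offset to the address of its section. The sketch below shows that computation under simplifying assumptions (a single output region, power-of-two alignment). The struct and function names, the base address, and the sizes in the usage note are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One input section as the linker sees it. (Hypothetical type.) */
struct input_section {
    const char *name;   /* e.g. "m.o(.text)" */
    uint32_t    size;   /* length in bytes, known from the object file */
    uint32_t    align;  /* required alignment, a power of two */
    uint32_t    addr;   /* absolute address, filled in by layout_sections() */
};

/* Assign an absolute address to every section, in the given order,
 * starting at `base`. Returns the first address after the last section. */
uint32_t layout_sections(struct input_section *secs, size_t n, uint32_t base)
{
    uint32_t cursor = base;
    for (size_t i = 0; i < n; i++) {
        uint32_t align = secs[i].align;
        assert(align != 0 && (align & (align - 1)) == 0);
        cursor = (cursor + align - 1) & ~(align - 1);  /* round up */
        secs[i].addr = cursor;
        cursor += secs[i].size;
    }
    return cursor;
}

/* Absolute address of a label or data item located `offset` bytes into
 * an already placed section. */
uint32_t symbol_address(const struct input_section *sec, uint32_t offset)
{
    return sec->addr + offset;
}
```

For example, with m.o(.text) of 16 bytes followed by a.o(.text) of an assumed 20 bytes, both 16-byte aligned and starting at an assumed base of 0x08048530, the first section is placed at 0x08048530 and the second at 0x08048540, so a label at offset 0 of the second section gets the absolute address 0x08048540.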
ELF (Executable and Linkable Format)

What each part holds:
- ELF header: magic number, type (.o / .so / exec), machine, byte order, …
- Program header table (required for executables): page size, virtual addresses of memory segments (sections), segment sizes
- .text section: code
- .data section: initialized (static) data
- .bss section: uninitialized (static) data; the name stands for "Block Started by Symbol"; it has a section header but occupies no space

Full file layout, starting at offset 0:
  ELF header
  Program header table (required for executables)
  .text section
  .data section
  .bss section
  .symtab
  .rel.text
  .rel.data
  .debug
  Section header table (required for relocatables)

At run time only the ELF header, program header table, .text, .data and .bss are needed. Sections that are not needed can be removed with the "strip" command. Note: .dynsym is still retained.
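The header fields listed above (magic number, file type, machine, byte order) sit at fixed offsets in the first bytes of every ELF file, so they can be read without any ELF library. The sketch below decodes them from a byte buffer. struct elf_summary, parse_elf_header() and elf_type_name() are hypothetical names; the offsets and constant values follow the ELF specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The handful of ELF header fields this sketch decodes. */
struct elf_summary {
    int      bits;           /* 32 or 64 */
    int      little_endian;  /* 1 = least significant byte first */
    uint16_t type;           /* 1 = relocatable, 2 = executable, 3 = shared */
    uint16_t machine;        /* e.g. 3 = Intel 80386 */
};

/* Decode the start of an ELF file held in `buf`.
 * Returns 0 on success, -1 if the bytes are not a valid ELF identification. */
int parse_elf_header(const uint8_t *buf, size_t len, struct elf_summary *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 20 || memcmp(buf, "\177ELF", 4) != 0)
        return -1;                              /* magic number mismatch */
    if ((buf[4] != 1 && buf[4] != 2) || (buf[5] != 1 && buf[5] != 2))
        return -1;                              /* unknown class or byte order */

    out->bits = (buf[4] == 1) ? 32 : 64;        /* e_ident[EI_CLASS] */
    out->little_endian = (buf[5] == 1);         /* e_ident[EI_DATA] */

    /* e_type and e_machine are 16-bit fields at offsets 16 and 18,
     * stored in the byte order the file itself declares. */
    if (out->little_endian) {
        out->type    = (uint16_t)(buf[16] | (buf[17] << 8));
        out->machine = (uint16_t)(buf[18] | (buf[19] << 8));
    } else {
        out->type    = (uint16_t)((buf[16] << 8) | buf[17]);
        out->machine = (uint16_t)((buf[18] << 8) | buf[19]);
    }
    return 0;
}

/* Map e_type to the three file kinds named on the slide. */
const char *elf_type_name(uint16_t type)
{
    switch (type) {
    case 1:  return "relocatable (.o)";
    case 2:  return "executable";
    case 3:  return "shared object (.so)";
    default: return "other";
    }
}
```

Fed the first 20 bytes of an i386 object file such as m.o, this should report a 32-bit, little-endian, relocatable file for machine 3. To use it on a real file, read the leading bytes with fread() and pass the buffer in.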
Relocation: platform-specific implementation

glibc elf/dynamic-link.h

  /* This can't just be an inline function because GCC is too dumb
     to inline functions containing inlines themselves.  */
  # define ELF_DYNAMIC_RELOCATE(map, lazy, consider_profile)            \
    do {                                                                \
      int edr_lazy = elf_machine_runtime_setup ((map), (lazy),          \
                                                (consider_profile));    \
      ELF_DYNAMIC_DO_REL ((map), edr_lazy);                             \
      ELF_DYNAMIC_DO_RELA ((map), edr_lazy);                            \
    } while (0)

glibc sysdeps/i386/dl-machine.h

  /* Set up the loaded object described by L so its unrelocated PLT
     entries will jump to the on-demand fixup code in dl-runtime.c.  */
  static inline int __attribute__ ((unused, always_inline))
  elf_machine_runtime_setup (struct link_map *l, int lazy, int profile)
  {
    :
    got[1] = (Elf32_Addr) l;  /* Identify this shared object.  */
    :
    got[2] = (Elf32_Addr) &_dl_runtime_resolve;  /* the ELF resolver */
    :
  }
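The comment in elf_machine_runtime_setup describes lazy binding: a call through an unrelocated PLT entry first reaches fixup code, which finds the real function, patches the table, and then jumps to it, so later calls go straight to the target. The toy model below imitates that behaviour with an ordinary function pointer. It is not glibc's mechanism (real PLT entries are machine code and _dl_runtime_resolve is an assembly trampoline); every name in it is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*func_t)(int);

/* Stand-in for a function that lives in a shared library. */
int real_square(int x) { return x * x; }

/* Toy "library symbol table" searched by the resolver. */
struct toy_symbol { const char *name; func_t addr; };
const struct toy_symbol toy_library[] = { { "square", real_square } };

int resolve_count = 0;            /* how many times the slow path ran */

func_t toy_lookup(const char *name)
{
    for (size_t i = 0; i < sizeof toy_library / sizeof toy_library[0]; i++)
        if (strcmp(toy_library[i].name, name) == 0)
            return toy_library[i].addr;
    return NULL;
}

int square_stub(int x);           /* forward declaration for the slot below */

/* One slot of a toy GOT: it starts out pointing at the resolver stub. */
func_t got_square = square_stub;

/* Plays the role of the on-demand fixup code: resolve once, patch the
 * slot, then forward the original call to the real function. */
int square_stub(int x)
{
    func_t resolved = toy_lookup("square");
    assert(resolved != NULL);
    resolve_count++;
    got_square = resolved;        /* later calls bypass this stub */
    return resolved(x);
}

/* Plays the role of the PLT entry: always an indirect call via the slot. */
int call_square(int x) { return got_square(x); }
```

Calling call_square() twice runs the resolver only on the first call; after that the slot holds the address of real_square. This is the reason lazy binding defers symbol lookup cost until a function is first used.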
References
- Linkers and Loaders, John R. Levine, 2000
- 程序員的自我修養 – 連結、載入與程式庫 (A Programmer's Self-Cultivation: Linking, Loading, and Libraries)
- LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation, http://llvm.org/pubs/2004-01-30-CGO-LLVM.html
- GCC, the GNU Compiler Collection, http://gcc.gnu.org/
- GNU Binutils, http://www.gnu.org/software/binutils/
- Embedded GLIBC, http://www.eglibc.org/home