Description
I have ported some code from gcc/ld to clang/lld. While actual code+data sizes went down, the output elf file size went up considerably. Upon inspection, this is because elf output by lld has comparatively many segments which are NOBITS/NOLOAD, yet map to unique file offsets (unique=not used by any other segment). This results in a large(r) amount of the elf being wasted space, compared to ld.
    // Segments are contiguous memory regions that has the same attributes
    // (e.g. executable or writable). There is one phdr for each segment.
    // Therefore, we need to create a new phdr when the next section has
    // incompatible flags or is loaded at a discontiguous address or memory
    // region using AT or AT> linker script command, respectively.
    //
    // As an exception, we don't create a separate load segment for the ELF
    // headers, even if the first "real" output has an AT or AT> attribute.
    //
    // In addition, NOBITS sections should only be placed at the end of a LOAD
    // segment (since it's represented as p_filesz < p_memsz). If we have a
    // not-NOBITS section after a NOBITS, we create a new LOAD for the latter
    // even if flags match, so as not to require actually writing the
    // supposed-to-be-NOBITS section to the output file. (However, we cannot do
    // so when hasSectionsCommand, since we cannot introduce the extra alignment
    // needed to create a new LOAD)
    uint64_t newFlags = computeFlags(ctx, sec->getPhdrFlags());
    uint64_t incompatible = flags ^ newFlags;
    if (!(newFlags & PF_W)) {
      // When --no-rosegment is specified, RO and RX sections are compatible.
      if (ctx.arg.singleRoRx)
        incompatible &= ~PF_X;
      // When --no-xosegment is specified (the default), XO and RX sections are
      // compatible.
      if (ctx.arg.singleXoRx)
        incompatible &= ~PF_R;
    }
    if (incompatible)
      load = nullptr;
    bool sameLMARegion =
        load && !sec->lmaExpr && sec->lmaRegion == load->firstSec->lmaRegion;
    if (load && sec != relroEnd &&
        sec->memRegion == load->firstSec->memRegion &&
        (sameLMARegion || load->lastSec == ctx.out.programHeaders.get()) &&
        (ctx.script->hasSectionsCommand || sec->type == SHT_NOBITS ||
         load->lastSec->type != SHT_NOBITS)) {
      load->p_flags |= newFlags;
    } else {
      load = addHdr(PT_LOAD, newFlags);
      flags = newFlags;
    }
    load->add(sec);

This comment isn't really reflected in the code:
[...] loaded at a discontiguous address or memory region using AT or AT> linker script command, respectively.
The code treats a pair of actually-contiguous segments as noncontiguous whenever the latter has any lmaExpr; it does not check whether the two are actually contiguous (after size + alignment) in LMA space. This condition can occur in linker scripts which use AT in such a way that a sequence of output sections is known to be physically contiguous (for example, an array of output sections used to describe an array of stacks). A linker script author could naively expect such output sections to consume a minimal amount of space in the ELF, as they describe a contiguous range of NOBITS.
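To make the missing check concrete, here is a minimal sketch of what "contiguous after size + alignment in LMA space" could mean. It is not lld code: `SectionInfo`, `alignUp`, `isLmaContiguous` and `countContiguousRuns` are hypothetical names, and the sketch assumes each section's LMA has already been resolved to a number and that alignments are powers of two.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for an output section (not lld's type).
struct SectionInfo {
  uint64_t lma;       // load address, already resolved from AT(...) or AT>
  uint64_t size;      // size in bytes
  uint64_t alignment; // assumed to be a non-zero power of two
};

// Round value up to the next multiple of a power-of-two alignment.
inline uint64_t alignUp(uint64_t value, uint64_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (value + alignment - 1) & ~(alignment - 1);
}

// True if next starts exactly where prev ends in LMA space, once the end of
// prev has been padded up to next's alignment.
inline bool isLmaContiguous(const SectionInfo &prev, const SectionInfo &next) {
  return next.lma == alignUp(prev.lma + prev.size, next.alignment);
}

// Number of runs of LMA-contiguous sections: a new run starts only where
// there is a real gap (or overlap) in LMA space.
inline size_t countContiguousRuns(const std::vector<SectionInfo> &secs) {
  if (secs.empty())
    return 0;
  size_t runs = 1;
  for (size_t i = 1; i < secs.size(); ++i)
    if (!isLmaContiguous(secs[i - 1], secs[i]))
      ++runs;
  return runs;
}
```

Under these assumptions, an array of equally sized stacks placed back to back with AT would form a single run, whereas the condition quoted above starts a new LOAD for every section that carries an lmaExpr.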
ld will additionally alias NOBITS sections to file offsets which occur earlier in the file, not just to the immediately preceding segment, although the lack of this is an obvious downside of lld's simple layout scheme.
As an aside, I do have to wonder why NOBITS segments wind up taking any file space at all.