path: root/src/link/Elf.zig
Age  Commit message  Author
2025-01-15  rework error handling in the backends  Andrew Kelley
2025-01-15  elf linker: conform to explicit error sets  Andrew Kelley
2025-01-15  macho linker conforms to explicit error sets, again  Andrew Kelley
2025-01-15  remove "FIXME" from codebase  Andrew Kelley
See #363. Please file issues rather than making TODO comments.
2025-01-15  macho linker: conform to explicit error sets  Andrew Kelley
Makes linker functions have small error sets, required to report diagnostics properly rather than having a massive error set that has a lot of codes. Other linker implementations are not ported yet. Also the branch is not passing semantic analysis yet.
2025-01-15  wasm linker: aggressive DODification  Andrew Kelley
The goals of this branch are to:
* compile faster when using the wasm linker and backend
* enable saving compiler state by directly copying in-memory linker state to disk.
* more efficient compiler memory utilization
* introduce integer type safety to wasm linker code
* generate better WebAssembly code
* fully participate in incremental compilation
* do as much work as possible outside of flush(), while continuing to do linker garbage collection.
* avoid unnecessary heap allocations
* avoid unnecessary indirect function calls
In order to accomplish these goals, this removes the ZigObject abstraction, as well as Symbol and Atom. These abstractions resulted in overly generic code, doing unnecessary work, and needless complications that simply go away by creating a better in-memory data model and emitting more things lazily.
For example, this makes wasm codegen emit MIR which is then lowered to wasm code during linking, with optimal function indexes etc, or relocations are emitted if outputting an object. Previously, this would always emit relocations, which are fully unnecessary when emitting an executable, and required all function calls to use the maximum size LEB encoding.
This branch introduces the concept of the "prelink" phase which occurs after all object files have been parsed, but before any Zcu updates are sent to the linker. This allows the linker to fully parse all objects into a compact memory model, which is guaranteed to be complete when Zcu code is generated.
This commit is not a complete implementation of all these goals; it is not even passing semantic analysis.
2025-01-05  Added support for thin lto  Travis Lange
2025-01-05  link: new incremental line number update API  mlugg
2024-12-31  link/Elf.zig: set stack size and build-id for dynamic libraries.  Jan200101
2024-11-24  std.Target: Add Os.HurdVersionRange for Os.Tag.hurd.  Alex Rønne Petersen
This is necessary since isGnuLibC() is true for hurd, so we need to be able to represent a glibc version for it. Also add an Os.TaggedVersionRange.gnuLibCVersion() convenience function.
2024-11-22  link: use target to determine risc-v eflag validity  David Rubin
2024-11-03  Merge pull request #21599 from alexrp/thumb-porting  Alex Rønne Petersen
2024-11-03  std.Target: Replace isARM() with isArmOrThumb() and rename it to isArm().  Alex Rønne Petersen
The old isARM() function was a portability trap. With the name it had, it seemed like the obviously correct function to use, but it didn't include Thumb. In the vast majority of cases where someone wants to ask "is the target Arm?", Thumb *should* be included. There are exactly 3 cases in the codebase where we do actually need to exclude Thumb, although one of those is in Aro and mirrors a check in Clang that is itself likely a bug. These rare cases can just add an extra isThumb() check.
2024-11-02  std.Target: Add muslabin32 and muslabi64 tags to Abi.  Alex Rønne Petersen
Once we upgrade to LLVM 20, these should be lowered verbatim rather than to simply musl. Similarly, the special case in llvmMachineAbi() should go away.
2024-10-30  link.File.Wasm: parse inputs in compilation pipeline  Andrew Kelley
Primarily, this moves linker input parsing from flush() into the linker task queue, which is executed simultaneously with the frontend. I also made it avoid redundantly opening the same archive file N times for each object file inside. Furthermore, hard code fixed buffer stream rather than using a generic stream type. Finally, I fixed the error handling of the Wasm.Archive.parse function. Please pay attention to this pattern of returning a struct rather than accepting a mutable struct as an argument. This ensures function-level atomicity and makes resource management straightforward. Deletes the file and path fields from Archive and Object. Removed a well-meaning but ultimately misguided suggestion about how to think about ZigObject since thinking about it that way has led to problematic anti-DOD patterns.
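The return-a-struct pattern described above can be sketched as follows. This is a minimal illustration written in Python for brevity, not the Zig code from the commit; `Archive`, `parse_archive`, and the toy "name:size" record format are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Archive:
    """Hypothetical parsed result; the fields are illustrative only."""
    members: tuple


def parse_archive(data: bytes) -> Archive:
    """Parse newline-separated 'name:size' records into a new Archive.

    Either a complete Archive is returned or ValueError is raised, so the
    caller never observes a half-initialized object.
    """
    members = []
    for line in data.decode("ascii").splitlines():
        name, sep, size = line.partition(":")
        if not sep or not name or not size.isdigit():
            raise ValueError(f"malformed record: {line!r}")
        members.append((name, int(size)))
    # The result is only constructed once every record has been validated.
    return Archive(members=tuple(members))
```

Because nothing owned by the caller is mutated, a failed parse leaves nothing to roll back; cleanup is confined to the function's own locals.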
2024-10-26  link/Elf.zig: ensure capacity before appending linker args.  Xavier Bouchoux
fixes e567abb339e1edaf5a3c86fe632522a3b8005275 "rework linker inputs" closes https://github.com/ziglang/zig/issues/21801
2024-10-23  mutex protect comp.arena in --verbose-link  Andrew Kelley
2024-10-23  glibc sometimes makes archives be ld scripts  Andrew Kelley
it is incredible how many bad ideas glibc has bundled into one project.
2024-10-23  link.Elf: unstable sort for section headers  Andrew Kelley
using name as tie-breaker.
2024-10-23  link.Elf: remove ZigObject from files  Andrew Kelley
By making it a field of link.Elf, it is now accessible without a data dependency on `files`, fixing a race condition with the codegen thread and linker thread.
2024-10-23  unify parsing codepaths between relocatable and non  Andrew Kelley
2024-10-23  branch fixes  Andrew Kelley
2024-10-23  move linker input file parsing to the compilation pipeline  Andrew Kelley
2024-10-23  link.Elf: untangle parseObject and parseArchive  Andrew Kelley
from link.Elf, so that they can be used earlier in the pipeline
2024-10-23  link.Elf: refactor output mode checking  Andrew Kelley
2024-10-23  link.Elf: fix double free of header in parseDso  Andrew Kelley
2024-10-23  rework linker inputs  Andrew Kelley
* Compilation.objects changes to Compilation.link_inputs which stores objects, archives, windows resources, shared objects, and strings intended to be put directly into the dynamic section. Order is now preserved between all of these kinds of linker inputs. If it is determined the order does not matter for a particular kind of linker input, that item should be moved to a different array.
* rename system_libs to windows_libs
* untangle library lookup from CLI types
* when doing library lookup, instead of using access syscalls, go ahead and open the files and keep the handles around for passing to the cache system and the linker.
* during library lookup and cache file hashing, use positioned reads to avoid affecting the file seek position.
* library directories are opened in the CLI and converted to Directory objects, warnings emitted for those that cannot be opened.
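The point about positioned reads in the list above can be illustrated with a small sketch. It is written in Python and assumes a POSIX system, where `os.pread` is available; the helper names `read_at` and `demo` are hypothetical and unrelated to the compiler's own code.

```python
import os
import tempfile


def read_at(fd: int, length: int, offset: int) -> bytes:
    """Read up to `length` bytes at `offset` without moving the seek position."""
    return os.pread(fd, length, offset)


def demo() -> tuple:
    """Show that a positioned read leaves the descriptor's offset untouched."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"\x7fELF-rest-of-file")
        tmp.flush()
        fd = os.open(tmp.name, os.O_RDONLY)
        try:
            before = os.lseek(fd, 0, os.SEEK_CUR)
            chunk = read_at(fd, 4, 1)
            after = os.lseek(fd, 0, os.SEEK_CUR)
        finally:
            os.close(fd)
    return chunk, before, after
```

Since the offset is never advanced, the same open handle can be handed to several consumers (for example a hasher and a parser) without any of them needing to seek back first.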
2024-10-23  introduce a CLI flag to enable .so scripts; default off  Andrew Kelley
The compiler defaults this value to off so that users whose system shared libraries are all ELF files don't have to pay the cost of checking every file to find out if it is a text file instead. When a GNU ld script is encountered, the error message instructs users about the CLI flag that will immediately solve their problem.
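A rough sketch of the trade-off described above, in Python. The only fact relied on is that ELF files begin with the four bytes `\x7fELF`; the function names, the `allow_so_scripts` parameter, and the error text are hypothetical and do not reflect the compiler's actual flag name or diagnostics.

```python
ELF_MAGIC = b"\x7fELF"


def looks_like_elf(header: bytes) -> bool:
    """Return True if the leading bytes carry the ELF magic number."""
    return header[:4] == ELF_MAGIC


def classify_shared_object(header: bytes, allow_so_scripts: bool) -> str:
    """Decide how to treat a `.so` input (hypothetical policy sketch)."""
    if looks_like_elf(header):
        return "elf"
    if allow_so_scripts:
        # Only with the opt-in enabled is a non-ELF file treated as a script.
        return "ld_script"
    raise ValueError(
        "not an ELF file; it may be a GNU ld script, "
        "which requires enabling .so script support"
    )
```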
2024-10-23  move ld script processing to the frontend  Andrew Kelley
along with the relevant logic, making the libraries within subject to the same search criteria as all the other libraries. this unfortunately means doing file system access on all .so files when targeting ELF to determine if they are linker scripts, however, I have a plan to address this.
2024-10-23  move link.Elf.LdScript to link.LdScript  Andrew Kelley
2024-10-23  link.Elf.LdScript: eliminate dependency on Elf.File  Andrew Kelley
this allows it to be used by the frontend
2024-10-16  std.Target: Remove isBpfFreestanding().  Alex Rønne Petersen
The only use of this has nothing to do with the OS tag.
2024-10-12  link.Elf: eliminate an O(N^2) algorithm in flush()  Andrew Kelley
Make shared_objects a StringArrayHashMap so that deduping does not need to happen in flush. That deduping code also was using an O(N^2) algorithm, which is not allowed in this codebase. There is another violation of this rule in resolveSymbols but this commit does not address it.
This required reworking shared object parsing, breaking it into independent components so that we could access soname earlier.
Shared object parsing had a few problems that I noticed and fixed in this commit:
* Many instances of incorrect use of align(1).
* `shnum * @sizeOf(elf.Elf64_Shdr)` can overflow based on user data.
* `@divExact` can cause illegal behavior based on user data.
* Strange versyms logic that wasn't present in mold nor lld. The logic was not commented and there is no git blame information in ziglang/zig nor kubkon/zld. I changed it to match mold and lld instead.
* Use of ArrayList for slices of memory that are never resized.
* finding DT_VERDEFNUM in a different loop than finding DT_SONAME. Ultimately I think we should follow mold's lead and ignore this integer, relying on null termination instead.
* Doing logic based on VER_FLG_BASE rather than ignoring it like mold and LLD do. No comment explaining why the behavior is different.
* Mutating the original ELF symbols rather than only storing the mangled name on the new Symbol struct.
I noticed something that I didn't try to address in this commit: Symbol stores a lot of redundant information that is already present in the ELF symbols. I suspect that the codebase could benefit from reworking Symbol to not store redundant information.
Additionally:
* Add some type safety to std.elf.
* Eliminate 1-3 file system reads for determining the kind of input files, by taking advantage of file name extension and handling error codes properly.
* Move more error handling methods to link.Diags and make them infallible and thread-safe
* Make the data dependencies obvious in the parameters of parseSharedObject.
It's now clear that the first two steps (Header and Parsed) can be done during the main Compilation pipeline, rather than waiting for flush().
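The move from a quadratic dedup pass to a keyed map can be sketched like this. The example is Python, where a `dict` preserves insertion order and so plays a role loosely similar to an array hash map; `dedup_by_soname` and the keep-the-first-entry policy are assumptions made for illustration, not the commit's actual behavior.

```python
def dedup_by_soname(inputs):
    """Collapse (soname, path) pairs so each soname appears once.

    Each insertion is an expected O(1) hash lookup, so the whole pass is
    O(N) instead of comparing every entry against every other entry.
    """
    shared_objects = {}
    for soname, path in inputs:
        # Assumption: the first path seen for a given soname wins.
        shared_objects.setdefault(soname, path)
    return shared_objects
```

Deduplicating at insertion time means the later flush step can iterate the map directly, with no separate cleanup pass.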
2024-10-11  link: consolidate diagnostics  Andrew Kelley
By organizing linker diagnostics into this struct, it becomes possible to share more code between linker backends, and more importantly it becomes possible to pass only the Diag struct to some functions, rather than passing the entire linker state object in. This makes data dependencies more obvious, making it easier to rearrange code and to multithread. Also fix MachO code abusing an atomic variable. Not only was it using the wrong atomic operation, it is unnecessary additional state since the state is already being protected by a mutex.
2024-10-11  work around C backend bug  Andrew Kelley
2024-10-11  link.Elf.sortShdrs: tease out data dependencies  Andrew Kelley
In order to reduce the logic that happens in flush() we need to see which data is being accessed by all this logic, so we can see which operations depend on each other.
2024-10-11  link.Elf: fix merge sections namespacing  Andrew Kelley
`link.Elf.merge_section.MergeSection` -> `link.Elf.Merge.Section`
2024-10-11  link.Elf: group section indexes  Andrew Kelley
so they cannot be forgotten when updating them after sorting them.
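One way to picture the benefit of grouping the indexes, sketched in Python: when every section index lives in a single structure, one remapping routine can walk all of the fields after a sort, and a newly added field is covered automatically. `SectionIndexes`, its field names, and `remap` are hypothetical and not taken from link.Elf.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class SectionIndexes:
    """Hypothetical group of optional section-header indexes."""
    text: Optional[int] = None
    data: Optional[int] = None
    symtab: Optional[int] = None

    def remap(self, old_to_new: list) -> None:
        """Rewrite every index field through `old_to_new` after a sort."""
        for f in fields(self):
            old = getattr(self, f.name)
            if old is not None:
                setattr(self, f.name, old_to_new[old])
```

Here `old_to_new[i]` is the position that section `i` occupies after sorting; unset indexes stay unset.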
2024-10-11  link.Elf: fix phdr_gnu_stack_index not included in sortPhdrs  Andrew Kelley
Adds type safety for program header indexes. Reduce the amount of state sortPhdrs has access to, helping make the data dependencies clear.
2024-10-10  link: fix false positive crtbegin/crtend detection  Andrew Kelley
Embrace the Path abstraction, doing more operations based on directory handles rather than absolute file paths. Most of the diff noise here comes from this one. Fix sorting of crtbegin/crtend atoms. Previously it would look at all path components for those strings. Make the C runtime path detection partially a pure function, and move some logic to glibc.zig where it belongs.
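The false positive described above can be illustrated with a short Python sketch that inspects only the final path component. The prefix-matching rule and the name `crt_kind` are assumptions made for illustration; the commit does not spell out the exact matching logic.

```python
import posixpath


def crt_kind(path: str):
    """Classify a path as a crtbegin/crtend object by its basename only.

    Searching the whole path would misfire on directories whose names
    happen to contain 'crtbegin' or 'crtend'.
    """
    base = posixpath.basename(path)
    if base.startswith("crtbegin"):
        return "begin"
    if base.startswith("crtend"):
        return "end"
    return None
```

For example, an object stored under a directory named `crtbegin-project` is not classified as a C runtime object, while `crtbeginS.o` is.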
2024-10-09  elf: clean up how we create un-allocated sections  Jakub Konka
2024-10-09  elf: change how we manage debug atoms in Dwarf linker  Jakub Konka
2024-10-09  elf: do not create atoms for section symbols that do not require it  Jakub Konka
2024-10-09  elf: move setting section size back to Elf.growSection  Jakub Konka
2024-10-09  elf: drastically simplify extracting section size logic  Jakub Konka
2024-10-09  elf: clear dynamic relocs before resolving relocs in atoms  Jakub Konka
When resolving and writing atoms to file, we may add dynamic relocs to the output buffer so clear the buffers before that happens.
2024-10-09  elf: add some extra logging for created dynamic relocs  Jakub Konka
2024-10-09  elf: combine growAllocSection and growNonAllocSection into growSection  Jakub Konka
2024-10-09  elf: move sections in segments that need moving only  Jakub Konka
2024-10-09  elf: mark objects as dirty/not-dirty  Jakub Konka
This way we can track if we need to redo the object parsing or not.