aboutsummaryrefslogtreecommitdiff
path: root/src/Compilation.zig
AgeCommit message (Collapse)Author
2025-01-15Compilation: account for C objects and resources in prelinkAndrew Kelley
2025-01-15wasm linker: implement stack pointer globalAndrew Kelley
2025-01-15implement the prelink phase in the frontendAndrew Kelley
this strategy uses a "postponed" queue to handle codegen tasks that spawn too early. there's probably a better way.
2025-01-15wasm linker: allow undefined imports when lib name is providedAndrew Kelley
and expose object_host_name as an option for setting the lib name for object files, since the wasm linking standards don't specify a way to do it.
2025-01-15elf linker: conform to explicit error setsAndrew Kelley
2025-01-15wasm linker: aggressive DODificationAndrew Kelley
The goals of this branch are to: * compile faster when using the wasm linker and backend * enable saving compiler state by directly copying in-memory linker state to disk. * more efficient compiler memory utilization * introduce integer type safety to wasm linker code * generate better WebAssembly code * fully participate in incremental compilation * do as much work as possible outside of flush(), while continuing to do linker garbage collection. * avoid unnecessary heap allocations * avoid unnecessary indirect function calls In order to accomplish this goals, this removes the ZigObject abstraction, as well as Symbol and Atom. These abstractions resulted in overly generic code, doing unnecessary work, and needless complications that simply go away by creating a better in-memory data model and emitting more things lazily. For example, this makes wasm codegen emit MIR which is then lowered to wasm code during linking, with optimal function indexes etc, or relocations are emitted if outputting an object. Previously, this would always emit relocations, which are fully unnecessary when emitting an executable, and required all function calls to use the maximum size LEB encoding. This branch introduces the concept of the "prelink" phase which occurs after all object files have been parsed, but before any Zcu updates are sent to the linker. This allows the linker to fully parse all objects into a compact memory model, which is guaranteed to be complete when Zcu code is generated. This commit is not a complete implementation of all these goals; it is not even passing semantic analysis.
2025-01-10cbe: fix miscomps of the compilerJacob Young
2025-01-06fix win32 manifest ID for DLLsReuben Dunnington
* MSDN documentation page covering what resource IDs manifests should have: https://learn.microsoft.com/en-us/windows/win32/sbscs/using-side-by-side-assemblies-as-a-resource * This change ensures shared libraries that embed win32 manifests use the proper ID of 2 instead of 1, which is only allowed for .exes. If the manifest uses the wrong ID, it will not be found and is essentially ignored.
2025-01-05Added support for thin ltoTravis Lange
2025-01-05link: new incremental line number update APImlugg
2025-01-04incremental: new `AnalUnit` to group dependencies on `std.builtin` declsmlugg
This commit reworks how values like the panic handler function are memoized during a compiler invocation. Previously, the value was resolved by whichever analysis requested it first, and cached on `Zcu`. This is problematic for incremental compilation, as after the initial resolution, no dependencies are marked by users of this memoized state. This is arguably acceptable for `std.builtin`, but it's definitely not acceptable for the panic handler/messages, because those can be set by the user (`std.builtin.Panic` checks `@import("root").Panic`). So, here we introduce a new kind of `AnalUnit`, called `memoized_state`. There are 3 such units: * `.{ .memoized_state = .va_list }` resolves the type `std.builtin.VaList` * `.{ .memoized_state = .panic }` resolves `std.Panic` * `.{ .memoized_state = .main }` resolves everything else we want These units essentially "bundle" the resolution of their corresponding declarations, storing the results into fields on `Zcu`. This way, when, for instance, a function wants to call the panic handler, it simply runs `ensureMemoizedStateResolved`, registering one dependency, and pulls the values from the `Zcu`. This "bundling" minimizes dependency edges. The 3 units are separated to allow them to act independently: for instance, the panic handler can use `std.builtin.Type` without triggering a dependency loop.
2025-01-01incremental: fix errors not being deleted upon re-analysismlugg
Previously, logic in `Compilation.getAllErrorsAlloc` was corrupting the `failed_analysis` hashmap. This meant that on updates after the initial update, attempts to remove entries from this map (because the `AnalUnit` in question is being re-analyzed) silently failed. This resulted in compile errors from earlier updates wrongly getting "stuck", i.e. never being removed. This commit also adds a few log calls which helped me to find this bug.
2024-12-31compiler: ensure result of `block_comptime` is comptime-knownmlugg
To avoid this PR regressing error messages, most of the work here has gone towards improving error notes for why code was comptime-evaluated. ZIR `block_comptime` now stores a "comptime reason", the enum for which is also used by Sema. There are two types in Sema: * `ComptimeReason` represents the reason we started evaluating something at comptime. * `BlockComptimeReason` represents the reason a given block is evaluated at comptime; it's either a `ComptimeReason` with an attached source location, or it's because we're in a function which was called at comptime (and that function's `Block` should be consulted for the "parent" reason). Every `Block` stores a `?BlockComptimeReason`. The old `is_comptime` field is replaced with a trivial `isComptime()` method which returns whether that reason is non-`null`. Lastly, the handling for `block_comptime` has been simplified. It was previously going through an unnecessary runtime-handling path; now, it is a trivial sub block exited through a `break_inline` instruction. Resolves: #22296
2024-12-24compiler: analyze type and value of global declaration separatelymlugg
This commit separates semantic analysis of the annotated type vs value of a global declaration, therefore allowing recursive and mutually recursive values to be declared. Every `Nav` which undergoes analysis now has *two* corresponding `AnalUnit`s: `.{ .nav_val = n }` and `.{ .nav_ty = n }`. The `nav_val` unit is responsible for *fully resolving* the `Nav`: determining its value, linksection, addrspace, etc. The `nav_ty` unit, on the other hand, resolves only the information necessary to construct a *pointer* to the `Nav`: its type, addrspace, etc. (It does also analyze its linksection, but that could be moved to `nav_val` I think; it doesn't make any difference). Analyzing a `nav_ty` for a declaration with no type annotation will just mark a dependency on the `nav_val`, analyze it, and finish. Conversely, analyzing a `nav_val` for a declaration *with* a type annotation will first mark a dependency on the `nav_ty` and analyze it, using this as the result type when evaluating the value body. The `nav_val` and `nav_ty` units always have references to one another: so, if a `Nav`'s type is referenced, its value implicitly is too, and vice versa. However, these dependencies are trivial, so, to save memory, are only known implicitly by logic in `resolveReferences`. In general, analyzing ZIR `decl_val` will only analyze `nav_ty` of the corresponding `Nav`. There are two exceptions to this. If the declaration is an `extern` declaration, then we immediately ensure the `Nav` value is resolved (which doesn't actually require any more analysis, since such a declaration has no value body anyway). Additionally, if the resolved type has type tag `.@"fn"`, we again immediately resolve the `Nav` value. The latter restriction is in place for two reasons: * Functions are special, in that their externs are allowed to trivially alias; i.e. with a declaration `extern fn foo(...)`, you can write `const bar = foo;`. This is not allowed for non-function externs, and it means that function types are the only place where it is possible for a declaration `Nav` to have a `.@"extern"` value without actually being declared `extern`. We need to identify this situation immediately so that the `decl_ref` can create a pointer to the *real* extern `Nav`, not this alias. * In certain situations, such as taking a pointer to a `Nav`, Sema needs to queue analysis of a runtime function if the value is a function. To do this, the function value needs to be known, so we need to resolve the value immediately upon `&foo` where `foo` is a function. This restriction is simple to codify into the eventual language specification, and doesn't limit the utility of this feature in practice. A consequence of this commit is that codegen and linking logic needs to be more careful when looking at `Nav`s. In general: * When `updateNav` or `updateFunc` is called, it is safe to assume that the `Nav` being updated (the owner `Nav` for `updateFunc`) is fully resolved. * Any `Nav` whose value is/will be an `@"extern"` or a function is fully resolved; see `Nav.getExtern` for a helper for a common case here. * Any other `Nav` may only have its type resolved. This didn't seem to be too tricky to satisfy in any of the existing codegen/linker backends. Resolves: #131
2024-12-24compiler: remove Caumlugg
The `Cau` abstraction originated from noting that one of the two primary roles of the legacy `Decl` type was to be the subject of comptime semantic analysis. However, the data stored in `Cau` has always had some level of redundancy. While preparing for #131, I went to remove that redundany, and realised that `Cau` now had exactly one field: `owner`. This led me to conclude that `Cau` is, in fact, an unnecessary level of abstraction over what are in reality *fundamentally different* kinds of analysis unit (`AnalUnit`). Types, `Nav` vals, and `comptime` declarations are all analyzed in different ways, and trying to treat them as the same thing is counterproductive! So, these 3 cases are now different alternatives in `AnalUnit`. To avoid stealing bits from `InternPool`-based IDs, which are already a little starved for bits due to the sharding datastructures, `AnalUnit` is expanded to 64 bits (30 of which are currently unused). This doesn't impact memory usage too much by default, because we don't store `AnalUnit`s all too often; however, we do store them a lot under `-fincremental`, so a non-trivial bump to peak RSS can be observed there. This will be improved in the future when I made `InternPool.DepEntry` less memory-inefficient. `Zcu.PerThread.ensureCauAnalyzed` is split into 3 functions, for each of the 3 new types of `AnalUnit`. The new logic is much easier to understand, because it avoids conflating the logic of these fundamentally different cases.
2024-12-20lldb: add pretty printer for intern pool indicesJacob Young
2024-12-15zig cc: Remove broken CUDA C/C++ support.Alex Rønne Petersen
2024-12-14Compilation: Clean up addCCArgs().Alex Rønne Petersen
The goal of this commit is to get rid of some "unused command line argument" warnings that Clang would give for various file types previously. This cleanup also has the side effect of making the order of flags more understandable, especially as it pertains to include paths. Since a lot of code was shuffled around in this commit, I recommend reviewing the old and new versions of the function side-by-side rather than trying to make sense of the diff.
2024-12-13Compilation: Use Clang dependency file for preprocessed assembly files.Alex Rønne Petersen
2024-12-13Compilation: Use a better canonical file extension for header files.Alex Rønne Petersen
2024-12-13Compilation: Override Clang's language type for header files.Alex Rønne Petersen
Clang seems to treat them as linker input without this.
2024-12-13Compilation: Improve classification of various C/C++/Objective-C files.Alex Rønne Petersen
2024-12-13Merge pull request #22035 from alexrp/unwind-fixesAlex Rønne Petersen
Better unwind table support + unwind protection in `_start()` and `clone()`
2024-12-11std.Build.Cache.hit: work around macOS kernel bugAndrew Kelley
The previous commit cast doubt upon the initial report about macOS kernel behavior, identifying another reason that ENOENT could be returned from file creation. However, it is demonstrable that ENOENT can be returned for both cases: 1. create file race 2. handle refers to deleted directory This commit re-introduces the workaround for the file creation race on macOS however it does not unconditionally retry - it first tries again with O_EXCL to disambiguate the error condition that has occurred.
2024-12-10std.Build.Cache.hit: more discipline in error handlingAndrew Kelley
Previous commits 2b0929929d67e222ca6a9523a3a594ed456c4a51 4ea2f441df36cec61e1017f4d795d4037326c98c had this text: > There are no dir components, so you would think that this was > unreachable, however we have observed on macOS two processes racing to > do openat() with O_CREAT manifest in ENOENT. This appears to have been a misunderstanding based on the issue report #12138 and corresponding PR #12139 in which the steps to reproduce removed the cache directory in a loop which also executed detached Zig compiler processes. There is no evidence for the macOS kernel bug however the ENOENT is easily explained by the removal of the cache directory. This commit reverts those commits, ultimately reporting the ENOENT as an error rather than repeating the create file operation. However this commit also adds an explicit error set to `std.Build.Cache.hit` as well as changing the `failed_file_index` to a proper diagnostic field that fully communicates what failed, leading to more informative error messages on failure to check the cache. The equivalent failure when occuring for AstGen performs a fatal process kill, reasoning being that the compiler has an invariant of the cache directory not being yanked out from underneath it while executing. This could be made a more granular error in the future but I suspect such thing is not valuable to pursue. Related to #18340 but does not solve it.
2024-12-11compiler: Improve the handling of unwind table levels.Alex Rønne Petersen
The goal here is to support both levels of unwind tables (sync and async) in zig cc and zig build. Previously, the LLVM backend always used async tables while zig cc was partially influenced by whatever was Clang's default.
2024-12-10fix unknown file extension with rmetaTravis Lange
2024-12-09Merge pull request #22157 from mlugg/astgen-error-lazyAndrew Kelley
compiler: allow semantic analysis of files with AstGen errors
2024-12-08Compilation: Don't rely on Clang defaults for options that are user-facing.Alex Rønne Petersen
2024-12-05compiler: incremental compilation fixesmlugg
The previous commit exposed some bugs in incremental compilation. This commit fixes those, and adds a little more logging for debugging incremental compilation. Also, allow `ast-check -t` to dump ZIR when there are non-fatal AstGen errors.
2024-12-05Compilation: Consider *.rlib files to be static libraries.Alex Rønne Petersen
These are produced by rustc: https://rustc-dev-guide.rust-lang.org/backend/libs-and-metadata.html#rlib
2024-11-28Merge pull request #22067 from alexrp/pie-testsAlex Rønne Petersen
Add PIC/PIE tests and fix some bugs + some improvements to the test harness
2024-11-26diversify "unable to spawn" failure messagesAndrew Kelley
to help understand where a spurious failure is occurring
2024-11-24std.Target: Add Os.HurdVersionRange for Os.Tag.hurd.Alex Rønne Petersen
This is necessary since isGnuLibC() is true for hurd, so we need to be able to represent a glibc version for it. Also add an Os.TaggedVersionRange.gnuLibCVersion() convenience function.
2024-11-23Compilation: Consider *.lo files to be object files.Alex Rønne Petersen
Fixes musl libc.so compilation with zig cc.
2024-11-13Compilation: Pass -municode on to Clang.Alex Rønne Petersen
This is supposed to define the UNICODE macro; it's not just a linker option. Closes #21978.
2024-11-05musl: Pass -fomit-frame-pointer via CrtFileOptions.Alex Rønne Petersen
2024-11-05musl: Pass -f(function,data)-sections via CrtFileOptions instead of CFLAGS.Alex Rønne Petersen
2024-11-05Compilation: Fix unwind table logic for compiler-rt.Alex Rønne Petersen
This looks to be a refactoring leftover.
2024-11-05Compilation: Also set essential module options when including compiler-rt.o.Alex Rønne Petersen
Closes #21831.
2024-11-05Compilation: Move no_builtin to Package.Module.Alex Rønne Petersen
This option, by its very nature, needs to be attached to a module. If it isn't, the code in a module could break at random when compiled into an application that doesn't have this option set. After this change, skip_linker_dependencies no longer implies no_builtin in the LLVM backend.
2024-11-03glibc: Don't build CRT objects that won't be used.Alex Rønne Petersen
2024-11-03Compilation: Use the regular module mechanism for setting PIC on CRT objects.Alex Rønne Petersen
addCCArgs() will then pass the appropriate flag to Clang.
2024-11-03Compilation: Pass -fno-PIC to clang if PIC is disabled.Alex Rønne Petersen
Let's not implicitly rely on whatever Clang's default is.
2024-11-02Merge pull request #21729 from alexrp/target-cpu-baselineAlex Rønne Petersen
`std.Target.Cpu.Model`: Further refinements to `generic()` and `baseline()`
2024-10-31compiler: remove anonymous struct types, unify all tuplesmlugg
This commit reworks how anonymous struct literals and tuples work. Previously, an untyped anonymous struct literal (e.g. `const x = .{ .a = 123 }`) was given an "anonymous struct type", which is a special kind of struct which coerces using structural equivalence. This mechanism was a holdover from before we used RLS / result types as the primary mechanism of type inference. This commit changes the language so that the type assigned here is a "normal" struct type. It uses a form of equivalence based on the AST node and the type's structure, much like a reified (`@Type`) type. Additionally, tuples have been simplified. The distinction between "simple" and "complex" tuple types is eliminated. All tuples, even those explicitly declared using `struct { ... }` syntax, use structural equivalence, and do not undergo staged type resolution. Tuples are very restricted: they cannot have non-`auto` layouts, cannot have aligned fields, and cannot have default values with the exception of `comptime` fields. Tuples currently do not have optimized layout, but this can be changed in the future. This change simplifies the language, and fixes some problematic coercions through pointers which led to unintuitive behavior. Resolves: #16865
2024-10-26Compilation: Omit Clang CPU model flags for some targets.Alex Rønne Petersen
2024-10-24remove leak from linkerDavid Rubin
2024-10-23avoid unnecessarily building Scrt1.o when cross-compiling glibcAndrew Kelley
which, in this branch causes a miscompilation because it would get sent to the linker.
2024-10-23combine codegen work queue and linker task queueAndrew Kelley
these tasks have some shared data dependencies so they cannot be done simultaneously. Future work should untangle these data dependencies so that more can be done in parallel. for now this commit ensures correctness by making linker input parsing and codegen tasks part of the same queue.