aboutsummaryrefslogtreecommitdiff
path: root/src/arch/wasm/CodeGen.zig
AgeCommit message (Collapse)Author
2025-01-15wasm linker: don't assume nav callees are fully resolvedAndrew Kelley
codegen can be called which contains calls to navs which have only their type resolved. this means the indirect function table needs to track nav indexes not ip indexes.
2025-01-15resolve merge conflictsAndrew Kelley
with 497592c9b45a94fb7b6028bf45b80f183e395a9b
2025-01-15use fixed writer in more placesAndrew Kelley
2025-01-15wasm linker: implement indirect function callsAndrew Kelley
2025-01-15wasm linker: avoid recursion in lowerZcuDataAndrew Kelley
instead of recursion, callers of the function are responsible for checking the respective tables that might have new entries in them and then calling lowerZcuData again.
2025-01-15wasm codegen: fix call_indirectAndrew Kelley
2025-01-15wasm codegen: fix extra index not relativeAndrew Kelley
2025-01-15wasm codegen: fix wrong union field for localsAndrew Kelley
2025-01-15wasm linker: implement missing logicAndrew Kelley
fix some compilation errors for reworked Emit now that it's actually referenced introduce DataSegment.Id for sorting data both from object files and from the Zcu. introduce optimization: data segment sorting includes a descending sort on reference count so that references to data can be smaller integers leading to better LEB encodings. this optimization is skipped for object files. implement uav address access function which is based on only 1 hash table lookup to find out the offset after sorting.
2025-01-15wasm codegen: remove dependency on PerThread where possibleAndrew Kelley
2025-01-15wasm codegen: fix lowering of 32/64 float rt callsAndrew Kelley
2025-01-15remove bad deinitAndrew Kelley
2025-01-15wasm linker: implement name, module name, and type for function importsAndrew Kelley
2025-01-15wasm: use call_intrinsic MIR instructionAndrew Kelley
2025-01-15wasm: move error_name lowering to Emit phaseAndrew Kelley
2025-01-15wasm codegen: rename func: CodeGen to cg: CodeGenAndrew Kelley
2025-01-15wasm codegen: switch on bool instead of intAndrew Kelley
2025-01-15wasm: implement errors_len as a MIR opcode with no linker involvementAndrew Kelley
2025-01-15wasm codegen: fix some compilation errorsAndrew Kelley
2025-01-15rewrite wasm/Emit.zigAndrew Kelley
mainly, rework how relocations works. This is the point at which symbol indexes are known - not before. And don't emit unnecessary relocations! They're only needed when emitting an object file. Changes wasm linker to keep MIR around long-lived so that fixups can be reapplied after linker garbage collection. use labeled switch while we're at it
2025-01-15compiler: add type safety for export indicesAndrew Kelley
2025-01-15wasm linker: aggressive DODificationAndrew Kelley
The goals of this branch are to: * compile faster when using the wasm linker and backend * enable saving compiler state by directly copying in-memory linker state to disk. * more efficient compiler memory utilization * introduce integer type safety to wasm linker code * generate better WebAssembly code * fully participate in incremental compilation * do as much work as possible outside of flush(), while continuing to do linker garbage collection. * avoid unnecessary heap allocations * avoid unnecessary indirect function calls In order to accomplish this goals, this removes the ZigObject abstraction, as well as Symbol and Atom. These abstractions resulted in overly generic code, doing unnecessary work, and needless complications that simply go away by creating a better in-memory data model and emitting more things lazily. For example, this makes wasm codegen emit MIR which is then lowered to wasm code during linking, with optimal function indexes etc, or relocations are emitted if outputting an object. Previously, this would always emit relocations, which are fully unnecessary when emitting an executable, and required all function calls to use the maximum size LEB encoding. This branch introduces the concept of the "prelink" phase which occurs after all object files have been parsed, but before any Zcu updates are sent to the linker. This allows the linker to fully parse all objects into a compact memory model, which is guaranteed to be complete when Zcu code is generated. This commit is not a complete implementation of all these goals; it is not even passing semantic analysis.
2024-12-24compiler: analyze type and value of global declaration separatelymlugg
This commit separates semantic analysis of the annotated type vs value of a global declaration, therefore allowing recursive and mutually recursive values to be declared. Every `Nav` which undergoes analysis now has *two* corresponding `AnalUnit`s: `.{ .nav_val = n }` and `.{ .nav_ty = n }`. The `nav_val` unit is responsible for *fully resolving* the `Nav`: determining its value, linksection, addrspace, etc. The `nav_ty` unit, on the other hand, resolves only the information necessary to construct a *pointer* to the `Nav`: its type, addrspace, etc. (It does also analyze its linksection, but that could be moved to `nav_val` I think; it doesn't make any difference). Analyzing a `nav_ty` for a declaration with no type annotation will just mark a dependency on the `nav_val`, analyze it, and finish. Conversely, analyzing a `nav_val` for a declaration *with* a type annotation will first mark a dependency on the `nav_ty` and analyze it, using this as the result type when evaluating the value body. The `nav_val` and `nav_ty` units always have references to one another: so, if a `Nav`'s type is referenced, its value implicitly is too, and vice versa. However, these dependencies are trivial, so, to save memory, are only known implicitly by logic in `resolveReferences`. In general, analyzing ZIR `decl_val` will only analyze `nav_ty` of the corresponding `Nav`. There are two exceptions to this. If the declaration is an `extern` declaration, then we immediately ensure the `Nav` value is resolved (which doesn't actually require any more analysis, since such a declaration has no value body anyway). Additionally, if the resolved type has type tag `.@"fn"`, we again immediately resolve the `Nav` value. The latter restriction is in place for two reasons: * Functions are special, in that their externs are allowed to trivially alias; i.e. with a declaration `extern fn foo(...)`, you can write `const bar = foo;`. This is not allowed for non-function externs, and it means that function types are the only place where it is possible for a declaration `Nav` to have a `.@"extern"` value without actually being declared `extern`. We need to identify this situation immediately so that the `decl_ref` can create a pointer to the *real* extern `Nav`, not this alias. * In certain situations, such as taking a pointer to a `Nav`, Sema needs to queue analysis of a runtime function if the value is a function. To do this, the function value needs to be known, so we need to resolve the value immediately upon `&foo` where `foo` is a function. This restriction is simple to codify into the eventual language specification, and doesn't limit the utility of this feature in practice. A consequence of this commit is that codegen and linking logic needs to be more careful when looking at `Nav`s. In general: * When `updateNav` or `updateFunc` is called, it is safe to assume that the `Nav` being updated (the owner `Nav` for `updateFunc`) is fully resolved. * Any `Nav` whose value is/will be an `@"extern"` or a function is fully resolved; see `Nav.getExtern` for a helper for a common case here. * Any other `Nav` may only have its type resolved. This didn't seem to be too tricky to satisfy in any of the existing codegen/linker backends. Resolves: #131
2024-11-24dwarf: fix stepping through an inline loop containing one statementJacob Young
Previously, stepping from the single statement within the loop would always exit the loop because all of the code unrolled from the loop is associated with the same line and treated by the debugger as one line.
2024-10-31compiler: remove anonymous struct types, unify all tuplesmlugg
This commit reworks how anonymous struct literals and tuples work. Previously, an untyped anonymous struct literal (e.g. `const x = .{ .a = 123 }`) was given an "anonymous struct type", which is a special kind of struct which coerces using structural equivalence. This mechanism was a holdover from before we used RLS / result types as the primary mechanism of type inference. This commit changes the language so that the type assigned here is a "normal" struct type. It uses a form of equivalence based on the AST node and the type's structure, much like a reified (`@Type`) type. Additionally, tuples have been simplified. The distinction between "simple" and "complex" tuple types is eliminated. All tuples, even those explicitly declared using `struct { ... }` syntax, use structural equivalence, and do not undergo staged type resolution. Tuples are very restricted: they cannot have non-`auto` layouts, cannot have aligned fields, and cannot have default values with the exception of `comptime` fields. Tuples currently do not have optimized layout, but this can be changed in the future. This change simplifies the language, and fixes some problematic coercions through pointers which led to unintuitive behavior. Resolves: #16865
2024-10-30link.File.Wasm: remove the "files" abstractionAndrew Kelley
Removes the `files` field from the Wasm linker, storing the ZigObject as its own field instead using a tagged union. This removes a layer of indirection when accessing the ZigObject, and untangles logic so that we can introduce a "pre-link" phase that prepares the linker state to handle only incremental updates to the ZigObject and then minimize logic inside flush(). Furthermore, don't make array elements store their own indexes, that's always a waste. Flattens some of the file system hierarchy and unifies variable names for easier refactoring. Introduces type safety for optional object indexes.
2024-10-19compiler: remove @setAlignStackmlugg
This commit finishes implementing #21209 by removing the `@setAlignStack` builtin in favour of `CallingConvention` payloads. The x86_64 backend is updated to use the stack alignment given in the calling convention (the LLVM backend was already updated in a previous commit). Resolves: #21209
2024-10-19std: update for new `CallingConvention`mlugg
The old `CallingConvention` type is replaced with the new `NewCallingConvention`. References to `NewCallingConvention` in the compiler are updated accordingly. In addition, a few parts of the standard library are updated to use the new type correctly.
2024-10-19compiler: introduce new `CallingConvention`mlugg
This commit begins implementing accepted proposal #21209 by making `std.builtin.CallingConvention` a tagged union. The stage1 dance here is a little convoluted. This commit introduces the new type as `NewCallingConvention`, keeping the old `CallingConvention` around. The compiler uses `std.builtin.NewCallingConvention` exclusively, but when fetching the type from `std` when running the compiler (e.g. with `getBuiltinType`), the name `CallingConvention` is used. This allows a prior build of Zig to be used to build this commit. The next commit will update `zig1.wasm`, and then the compiler and standard library can be updated to completely replace `CallingConvention` with `NewCallingConvention`. The second half of #21209 is to remove `@setAlignStack`, which will be implemented in another commit after updating `zig1.wasm`.
2024-10-08stage2-wasm: airRem + airMod for floatsPavel Verigo
2024-10-04remove `@fence` (#21585)David Rubin
closes #11650
2024-09-12Replace deprecated default initializations with decl literalsLinus Groh
2024-09-10codegen: implement output to the `.debug_info` sectionJacob Young
2024-09-01wasm: un-regress `loop` and `switch_br`mlugg
`.loop` is also a block, so the block_depth must be stored *after* block creation, ensuring a correct block_depth to jump back to when receiving `.repeat`. This also un-regresses `switch_br` which now correctly handles ranges within cases. It supports it for both jump tables as well as regular conditional branches.
2024-09-01compiler: implement labeled switch/continuemlugg
2024-09-01Air: add explicit `repeat` instruction to repeat loopsmlugg
This commit introduces a new AIR instruction, `repeat`, which causes control flow to move back to the start of a given AIR loop. `loop` instructions will no longer automatically perform this operation after control flow reaches the end of the body. The motivation for making this change now was really just consistency with the upcoming implementation of #8220: it wouldn't make sense to have this feature work significantly differently. However, there were already some TODOs kicking around which wanted this feature. It's useful for two key reasons: * It allows loops over AIR instruction bodies to loop precisely until they reach a `noreturn` instruction. This allows for tail calling a few things, and avoiding a range check on each iteration of a hot path, plus gives a nice assertion that validates AIR structure a little. This is a very minor benefit, which this commit does apply to the LLVM and C backends. * It should allow for more compact ZIR and AIR to be emitted by having AstGen emit `repeat` instructions more often rather than having `continue` statements `break` to a `block` which is *followed* by a `repeat`. This is done in status quo because `repeat` instructions only ever cause the direct parent block to repeat. Now that AIR is more flexible, this flexibility can be pretty trivially extended to ZIR, and we can then emit better ZIR. This commit does not implement this. Support for this feature is currently regressed on all self-hosted native backends, including x86_64. This support will be added where necessary before this branch is merged.
2024-09-01Air: direct representation of ranges in switch casesmlugg
This commit modifies the representation of the AIR `switch_br` instruction to represent ranges in cases. Previously, Sema emitted different AIR in the case of a range, where the `else` branch of the `switch_br` contained a simple `cond_br` for each such case which did a simple range check (`x > a and x < b`). Not only does this add complexity to Sema, which we would like to minimize, but it also gets in the way of the implementation of #8220. That proposal turns certain `switch` statements into a looping construct, and for optimization purposes, we want to lower this to AIR fairly directly (i.e. without involving a `loop` instruction). That means we would ideally like a single instruction to represent the entire `switch` statement, so that we can dispatch back to it with a different operand as in #8220. This is not really possible to do correctly under the status quo system. This commit implements lowering of this new `switch_br` usage in the LLVM and C backends. The C backend just turns any case containing ranges entirely into conditionals, as before. The LLVM backend is a little smarter, and puts scalar items into the `switch` instruction, only using conditionals for the range cases (which direct to the same bb). All remaining self-hosted backends are temporarily regressed in the presence of switch range cases. This functionality will be restored for at least the x86_64 backend before merge.
2024-08-28std: update `std.builtin.Type` fields to follow naming conventionsmlugg
The compiler actually doesn't need any functional changes for this: Sema does reification based on the tag indices of `std.builtin.Type` already! So, no zig1.wasm update is necessary. This change is necessary to disallow name clashes between fields and decls on a type, which is a prerequisite of #9938.
2024-08-27Dwarf: fix and test string formatJacob Young
2024-08-27compiler: implement `@branchHint`, replacing `@setCold`mlugg
Implements the accepted proposal to introduce `@branchHint`. This builtin is permitted as the first statement of a block if that block is the direct body of any of the following: * a function (*not* a `test`) * either branch of an `if` * the RHS of a `catch` or `orelse` * a `switch` prong * an `or` or `and` expression It lowers to the ZIR instruction `extended(branch_hint(...))`. When Sema encounters this instruction, it sets `sema.branch_hint` appropriately, and `zirCondBr` etc are expected to reset this value as necessary. The state is on `Sema` rather than `Block` to make it automatically propagate up non-conditional blocks without special handling. If `@panic` is reached, the branch hint is set to `.cold` if none was already set; similarly, error branches get a hint of `.unlikely` if no hint is explicitly provided. If a condition is comptime-known, `cold` hints from the taken branch are allowed to propagate up, but other hints are discarded. This is because a `likely`/`unlikely` hint just indicates the direction this branch is likely to go, which is redundant information when the branch is known at comptime; but `cold` hints indicate that control flow is unlikely to ever reach this branch, meaning if the branch is always taken from its parent, then the parent is also unlikely to ever be reached. This branch information is stored in AIR `cond_br` and `switch_br`. In addition, `try` and `try_ptr` instructions have variants `try_cold` and `try_ptr_cold` which indicate that the error case is cold (rather than just unlikely); this is reachable through e.g. `errdefer unreachable` or `errdefer @panic("")`. A new API `unwrapSwitch` is introduced to `Air` to make it more convenient to access `switch_br` instructions. In time, I plan to update all AIR instructions to be accessed via an `unwrap` method which returns a convenient tagged union a la `InternPool.indexToKey`. The LLVM backend lowers branch hints for conditional branches and switches as follows: * If any branch is marked `unpredictable`, the instruction is marked `!unpredictable`. * Any branch which is marked as `cold` gets a `llvm.assume(i1 true) [ "cold"() ]` call to mark the code path cold. * If any branch is marked `likely` or `unlikely`, branch weight metadata is attached with `!prof`. Likely branches get a weight of 2000, and unlikely branches a weight of 1. In `switch` statements, un-annotated branches get a weight of 1000 as a "middle ground" hint, since there could be likely *and* unlikely *and* un-annotated branches. For functions, a `cold` hint corresponds to the `cold` function attribute, and other hints are currently ignored -- as far as I can tell LLVM doesn't really have a way to lower them. (Ideally, we would want the branch hint given in the function to propagate to call sites.) The compiler and standard library do not yet use this new builtin. Resolves: #21148
2024-08-25comp: rename `module` to `zcu`David Rubin
2024-08-25sema: clean-up `{union,struct}FieldAlignment` and friendsDavid Rubin
My main gripes with this design were that it was incorrectly namespaced, the naming was inconsistent and a bit wrong (`fooAlign` vs `fooAlignment`). This commit moves all the logic from `PerThread.zig` to use the zcu + tid system that the previous couple commits introduce. I've organized and merged the functions to be a bit more specific to their own purpose. - `fieldAlignment` takes a struct or union type, an index, and a Zcu (or the Sema version which takes a Pt), and gives you the alignment of the field at the index. - `structFieldAlignment` takes the field type itself, and provides the logic to handle special cases, such as externs. A design goal I had in mind was to avoid using the word 'struct' in the function name, when it worked for things that aren't structs, such as unions.
2024-08-25sema: rework type resolution to use Zcu when possibleDavid Rubin
2024-08-20Dwarf: emit info about inline call sitesJacob Young
2024-08-16Dwarf: rework self-hosted debug info from scratchJacob Young
This is in preparation for incremental and actually being able to debug executables built by the x86_64 backend.
2024-08-11compiler: split Decl into Nav and Caumlugg
The type `Zcu.Decl` in the compiler is problematic: over time it has gained many responsibilities. Every source declaration, container type, generic instantiation, and `@extern` has a `Decl`. The functions of these `Decl`s are in some cases entirely disjoint. After careful analysis, I determined that the two main responsibilities of `Decl` are as follows: * A `Decl` acts as the "subject" of semantic analysis at comptime. A single unit of analysis is either a runtime function body, or a `Decl`. It registers incremental dependencies, tracks analysis errors, etc. * A `Decl` acts as a "global variable": a pointer to it is consistent, and it may be lowered to a specific symbol by the codegen backend. This commit eliminates `Decl` and introduces new types to model these responsibilities: `Cau` (Comptime Analysis Unit) and `Nav` (Named Addressable Value). Every source declaration, and every container type requiring resolution (so *not* including `opaque`), has a `Cau`. For a source declaration, this `Cau` performs the resolution of its value. (When #131 is implemented, it is unsolved whether type and value resolution will share a `Cau` or have two distinct `Cau`s.) For a type, this `Cau` is the context in which type resolution occurs. Every non-`comptime` source declaration, every generic instantiation, and every distinct `extern` has a `Nav`. These are sent to codegen/link: the backends by definition do not care about `Cau`s. This commit has some minor technically-breaking changes surrounding `usingnamespace`. I don't think they'll impact anyone, since the changes are fixes around semantics which were previously inconsistent (the behavior changed depending on hashmap iteration order!). Aside from that, this changeset has no significant user-facing changes. Instead, it is an internal refactor which makes it easier to correctly model the responsibilities of different objects, particularly regarding incremental compilation. The performance impact should be negligible, but I will take measurements before merging this work into `master`. Co-authored-by: Jacob Young <jacobly0@users.noreply.github.com> Co-authored-by: Jakub Konka <kubkon@jakubkonka.com>
2024-07-23stage2-wasm: fix bigint div and truncPavel Verigo
2024-07-23stage2-wasm: mul_sat 32 bits <=, i64, i128Pavel Verigo
2024-07-20Merge pull request #20692 from pavelverigo/stage2-wasm-overflow-opsAndrew Kelley
stage2-wasm: overflow ops improvement
2024-07-20Fix typos in code comments in `src/`sobolevn