aboutsummaryrefslogtreecommitdiff
path: root/src/Air
AgeCommit message (Collapse)Author
2025-11-20update deprecated ArrayListUnmanaged usage (#25958)Benjamin Jurk
2025-11-16Dedupe types when printing error messagesProkop Randáček
2025-11-15Legalize: implement soft-float legalizationsMatthew Lugg
A new `Legalize.Feature` tag is introduced for each float bit width (16/32/64/80/128). When e.g. `soft_f16` is enabled, all arithmetic and comparison operations on `f16` are converted to calls to the appropriate compiler_rt function using the new AIR tag `.legalize_compiler_rt_call`. This includes casts where the source *or* target type is `f16`, or integer<=>float conversions to or from `f16`. Occasionally, operations are legalized to blocks because there is extra code required; for instance, legalizing `@floatFromInt` where the integer type is larger than 64 bits requires calling an arbitrary-width integer conversion function which accepts a pointer to the integer, so we need to use `alloc` to create such a pointer, and store the integer there (after possibly zero-extending or sign-extending it). No backend currently uses these new legalizations (and as such, no backend currently needs to implement `.legalize_compiler_rt_call`). However, for testing purposes, I tried modifying the self-hosted x86_64 backend to enable all of the soft-float features (and implement the AIR instruction). This modified backend was able to pass all of the behavior tests (except for one `@mod` test where the LLVM backend has a bug resulting in incorrect compiler-rt behavior!), including the tests specific to the self-hosted x86_64 backend. `f16` and `f80` legalizations are likely of particular interest to backend developers, because most architectures do not have instructions to operate on these types. However, enabling *all* of these legalization passes can be useful when developing a new backend to hit the ground running and pass a good amount of tests more easily.
2025-11-12Air.Legalize: revert to loops for scalarizationsMatthew Lugg
I had tried unrolling the loops to avoid requiring the `vector_store_elem` instruction, but it's arguably a problem to generate O(N) code for an operation on `@Vector(N, T)`. In addition, that lowering emitted a lot of `.aggregate_init` instructions, which is itself a quite difficult operation to codegen. This requires reintroducing runtime vector indexing internally. However, I've put it in a couple of instructions which are intended only for use by `Air.Legalize`, named `legalize_vec_elem_val` (like `array_elem_val`, but for indexing a vector with a runtime-known index) and `legalize_vec_store_elem` (like the old `vector_store_elem` instruction). These are explicitly documented as *not* being emitted by Sema, so need only be implemented by backends if they actually use an `Air.Legalize.Feature` which emits them (otherwise they can be marked as `unreachable`).
2025-11-12compiler: spring cleaningMatthew Lugg
I started this diff trying to remove a little dead code from the C backend, but ended up finding a bunch of dead code sprinkled all over the place: * `packed` handling in the C backend which was made dead by `Legalize` * Representation of pointers to runtime-known vector indices * Handling for the `vector_store_elem` AIR instruction (now removed) * Old tuple handling from when they used the InternPool repr of structs * Straightforward unused functions * TODOs in the LLVM backend for features which Zig just does not support
2025-11-12Legalize: rewrite several legalizationsMatthew Lugg
The main goal of this change was to avoid emitting the `vector_store_elem` AIR tag, because this represents an operation which Zig no longer supports (and hence Sema no longer emits) as of 010d9a6 (because runtime vector indices are now forbidden). Backends should not need to lower this operation, so I rewrote the legalizations which emitted it (scalarizations of vector operations) to instead unroll the loop and hence emit comptime-known vector indices. In doing this, I actually reworked those legalizations to use a different strategy; instead of using an `alloc` and storing to individual vector elements, the vector is constructed by-val, for instance by performing the scalar operation on all elements and passing them to an `aggregate_init`. This is vastly simpler to implement in Legalize, conceptually simpler, and doesn't severely pessimise memory usage, because a non-optimizing backend will store the full vector on the stack either way. Given the above rationale, I also ended up reworking several other legalizations to use simpler lowerings. The legalizations in question were bitcast scalarization, `struct_field_val` of `packed struct`s (where we just bitcast to an integer and perform the appropriate shift/trunc sequence), and `aggregate_init` of a `packed struct` (also implemented in terms of integer bitwise operations with bitcasts to and from the actual types). This hugely simplified some parts of `Legalize`. So, `Legalize` is now much simpler, and the `vector_store_elem` instruction is no longer emitted by any part of the compiler so can be removed in a future commit.
2025-10-30std.debug.lockStderrWriter: also return ttyconfMatthew Lugg
`std.Io.tty.Config.detect` may be an expensive check (e.g. involving syscalls), and doing it every time we need to print isn't really necessary; under normal usage, we can compute the value once and cache it for the whole program's execution. Since anyone outputting to stderr may reasonably want this information (in fact they are very likely to), it makes sense to cache it and return it from `lockStderrWriter`. Call sites who do not need it will experience no significant overhead, and can just ignore the TTY config with a `const w, _` destructure.
2025-09-20frontend: packed struct field ptr no longer finds byte bordersAndrew Kelley
technically breaking, but I doubt anyone will notice.
2025-09-20llvm backend: remove canElideLoad mechanismAndrew Kelley
2025-08-29std.Io: delete GenericReaderAndrew Kelley
and delete deprecated alias std.io
2025-08-12replace even more aggregate internsJustus Klausecker
2025-07-16inline assembly: use typesAndrew Kelley
until now these were stringly typed. it's kinda obvious when you think about it.
2025-07-07std.fmt: fully remove format string from format methodsAndrew Kelley
Introduces `std.fmt.alt` which is a helper for calling alternate format methods besides one named "format".
2025-07-07compiler: update a bunch of format stringsAndrew Kelley
2025-07-07compiler: fix a bunch of format stringsAndrew Kelley
2025-07-07std.zig.llvm.Builder: update format APIAndrew Kelley
2025-07-07compiler: upgrade various std.io API usageAndrew Kelley
2025-07-07std.fmt: breaking API changesAndrew Kelley
added adapter to AnyWriter and GenericWriter to help bridge the gap between old and new API make std.testing.expectFmt work at compile-time std.fmt no longer has a dependency on std.unicode. Formatted printing was never properly unicode-aware. Now it no longer pretends to be. Breakage/deprecations: * std.fs.File.reader -> std.fs.File.deprecatedReader * std.fs.File.writer -> std.fs.File.deprecatedWriter * std.io.GenericReader -> std.io.Reader * std.io.GenericWriter -> std.io.Writer * std.io.AnyReader -> std.io.Reader * std.io.AnyWriter -> std.io.Writer * std.fmt.format -> std.fmt.deprecatedFormat * std.fmt.fmtSliceEscapeLower -> std.ascii.hexEscape * std.fmt.fmtSliceEscapeUpper -> std.ascii.hexEscape * std.fmt.fmtSliceHexLower -> {x} * std.fmt.fmtSliceHexUpper -> {X} * std.fmt.fmtIntSizeDec -> {B} * std.fmt.fmtIntSizeBin -> {Bi} * std.fmt.fmtDuration -> {D} * std.fmt.fmtDurationSigned -> {D} * {} -> {f} when there is a format method * format method signature - anytype -> *std.io.Writer - inferred error set -> error{WriteFailed} - options -> (deleted) * std.fmt.Formatted - now takes context type explicitly - no fmt string
2025-07-07std.io: move getStdIn, getStdOut, getStdErr functions to fs.FileAndrew Kelley
preparing to rearrange std.io namespace into an interface how to upgrade: std.io.getStdIn() -> std.fs.File.stdin() std.io.getStdOut() -> std.fs.File.stdout() std.io.getStdErr() -> std.fs.File.stderr()
2025-06-19x86_64: increase passing test coverage on windowsJacob Young
Now that codegen has no references to linker state this is much easier. Closes #24153
2025-06-15compiler: fix `@intFromFloat` safety checkmlugg
This safety check was completely broken; it triggered unchecked illegal behavior *in order to implement the safety check*. You definitely can't do that! Instead, we must explicitly check the boundaries. This is a tiny bit fiddly, because we need to make sure we do floating-point rounding in the correct direction, and also handle the fact that the operation truncates so the boundary works differently for min vs max. Instead of implementing this safety check in Sema, there are now dedicated AIR instructions for safety-checked intfromfloat (two instructions; which one is used depends on the float mode). Currently, no backend directly implements them; instead, a `Legalize.Feature` is added which expands the safety check, and this feature is enabled for all backends we currently test, including the LLVM backend. The `u0` case is still handled in Sema, because Sema needs to check for that anyway due to the comptime-known result. The old safety check here was also completely broken and has therefore been rewritten. In that case, we just check for 'abs(input) < 1.0'. I've added a bunch of test coverage for the boundary cases of `@intFromFloat`, both for successes (in `test/behavior/cast.zig`) and failures (in `test/cases/safety/`). Resolves: #24161
2025-06-15Legalize: make the feature set comptime-known in zig1Jacob Young
This allows legalizations to be added that aren't used by zig1 without affecting the size of zig1.
2025-06-12x86_64: remove air references from mirJacob Young
2025-06-06x86_64: add support for pie executablesJacob Young
2025-06-03Legalize: handle packed semanticsJacob Young
Closes #22915
2025-06-01compiler: combine `@intCast` safety checksmlugg
`castTruncatedData` was a poorly worded error (all shrinking casts "truncate bits", it's just that we assume those bits to be zext/sext of the other bits!), and `negativeToUnsigned` was a pointless distinction which forced the compiler to emit worse code (since two separate safety checks were required for casting e.g. 'i32' to 'u16') and wasn't even implemented correctly. This commit combines those safety panics into one function, `integerOutOfBounds`. The name maybe isn't perfect, but that's not hugely important; what matters is the new default message, which is clearer than the old ones: "integer does not fit in destination type".
2025-06-01Legalize: implement scalarization of overflow intrinsicsJacob Young
2025-06-01Legalize: implement scalarization of `@shuffle`Jacob Young
2025-06-01compiler: implement better shuffle AIRmlugg
Runtime `@shuffle` has two cases which backends generally want to handle differently for efficiency: * One runtime vector operand; some result elements may be comptime-known * Two runtime vector operands; some result elements may be undefined The latter case happens if both vectors given to `@shuffle` are runtime-known and they are both used (i.e. the mask refers to them). Otherwise, if the result is not entirely comptime-known, we are in the former case. `Sema` now diffentiates these two cases in the AIR so that backends can easily handle them however they want to. Note that this *doesn't* really involve Sema doing any more work than it would otherwise need to, so there's not really a negative here! Most existing backends have their lowerings for `@shuffle` migrated in this commit. The LLVM backend uses new lowerings suggested by Jacob as ones which it will handle effectively. The x86_64 backend has not yet been migrated; for now there's a panic in there. Jacob will implement that before this is merged anywhere.
2025-06-01Legalize: implement scalarization of `@select`Jacob Young
2025-06-01Legalize: update for new Block APIJacob Young
2025-06-01Legalize: replace `safety_checked_instructions`mlugg
This adds 4 `Legalize.Feature`s: * `expand_intcast_safe` * `expand_add_safe` * `expand_sub_safe` * `expand_mul_safe` These do pretty much what they say on the tin. This logic was previously in Sema, used when `Zcu.Feature.safety_checked_instructions` was not supported by the backend. That `Zcu.Feature` has been removed in favour of this legalization.
2025-05-31cbe: implement `stdbool.h` reserved identifiersJacob Young
Also remove the legalize pass from zig1.
2025-05-31Sema: remove `all_vector_instructions` logicJacob Young
Backends can instead ask legalization on a per-instruction basis.
2025-05-31Legalize: implement scalarization of binary operationsJacob Young
2025-05-31Legalize: implement scalarization of unary operationsJacob Young
2025-05-29Legalize: introduce a new pass before livenessJacob Young
Each target can opt into different sets of legalize features. By performing these transformations before liveness, instructions that become unreferenced will have up-to-date liveness information.
2025-05-27compiler: tlv pointers are not comptime-knownmlugg
Pointers to thread-local variables do not have their addresses known until runtime, so it is nonsensical for them to be comptime-known. There was logic in the compiler which was essentially attempting to treat them as not being comptime-known despite the pointer being an interned value. This was a bit of a mess, the check was frequent enough to actually show up in compiler profiles, and it was very awkward for backends to deal with, because they had to grapple with the fact that a "constant" they were lowering might actually require runtime operations. So, instead, do not consider these pointers to be comptime-known in *any* way. Never intern such a pointer; instead, when the address of a threadlocal is taken, emit an AIR instruction which computes the pointer at runtime. This avoids lots of special handling for TLVs across basically all codegen backends; of all somewhat-functional backends, the only one which wasn't improved by this change was the LLVM backend, because LLVM pretends this complexity around threadlocals doesn't exist. This change simplifies Sema and codegen, avoids a potential source of bugs, and potentially improves Sema performance very slightly by avoiding a non-trivial check on a hot path.
2025-04-26compiler: add @memmove builtindweiller
2025-04-05Dwarf: handle undefined type valuesJacob Young
Closes #23461
2025-01-31Sema: introduce all_vector_instructions backend featureJacob Young
Sema is arbitrarily scalarizing some operations, which means that when I try to implement vectorized versions of those operations in a backend, they are impossible to test due to Sema not producing them. Now, I can implement them and then temporarily enable the new feature for that backend in order to test them. Once the backend supports all of them, the feature can be permanently enabled. This also deletes the Air instructions `int_from_bool` and `int_from_ptr`, which are just bitcasts with a fixed result type, since changing `un_op` to `ty_op` takes up the same amount of memory.
2025-01-30compiler: add `intcast_safe` AIR instructionmlugg
This instruction is like `intcast`, but includes two safety checks: * Checks that the int is in range of the destination type * If the destination type is an exhaustive enum, checks that the int is a named enum value This instruction is locked behind the `safety_checked_instructions` backend feature; if unsupported, Sema will emit a fallback, as with other safety-checked instructions. This instruction is used to add a missing safety check for `@enumFromInt` truncating bits. This check also has a fallback for backends which do not yet support `safety_checked_instructions`. Resolves: #21946
2025-01-21compiler: simplify generic functions, fix issues with inline callsmlugg
The original motivation here was to fix regressions caused by #22414. However, while working on this, I ended up discussing a language simplification with Andrew, which changes things a little from how they worked before #22414. The main user-facing change here is that any reference to a prior function parameter, even if potentially comptime-known at the usage site or even not analyzed, now makes a function generic. This applies even if the parameter being referenced is not a `comptime` parameter, since it could still be populated when performing an inline call. This is a breaking language change. The detection of this is done in AstGen; when evaluating a parameter type or return type, we track whether it referenced any prior parameter, and if so, we mark this type as being "generic" in ZIR. This will cause Sema to not evaluate it until the time of instantiation or inline call. A lovely consequence of this from an implementation perspective is that it eliminates the need for most of the "generic poison" system. In particular, `error.GenericPoison` is now completely unnecessary, because we identify generic expressions earlier in the pipeline; this simplifies the compiler and avoids redundant work. This also entirely eliminates the concept of the "generic poison value". The only remnant of this system is the "generic poison type" (`Type.generic_poison` and `InternPool.Index.generic_poison_type`). This type is used in two places: * During semantic analysis, to represent an unknown result type. * When storing generic function types, to represent a generic parameter/return type. It's possible that these use cases should instead use `.none`, but I leave that investigation to a future adventurer. One last thing. Prior to #22414, inline calls were a little inefficient, because they re-evaluated even non-generic parameter types whenever they were called. Changing this behavior is what ultimately led to #22538. Well, because the new logic will mark a type expression as generic if there is any change its resolved type could differ in an inline call, this redundant work is unnecessary! So, this is another way in which the new design reduces redundant work and complexity. Resolves: #22494 Resolves: #22532 Resolves: #22538
2024-11-24dwarf: fix stepping through an inline loop containing one statementJacob Young
Previously, stepping from the single statement within the loop would always exit the loop because all of the code unrolled from the loop is associated with the same line and treated by the debugger as one line.
2024-10-31compiler: remove anonymous struct types, unify all tuplesmlugg
This commit reworks how anonymous struct literals and tuples work. Previously, an untyped anonymous struct literal (e.g. `const x = .{ .a = 123 }`) was given an "anonymous struct type", which is a special kind of struct which coerces using structural equivalence. This mechanism was a holdover from before we used RLS / result types as the primary mechanism of type inference. This commit changes the language so that the type assigned here is a "normal" struct type. It uses a form of equivalence based on the AST node and the type's structure, much like a reified (`@Type`) type. Additionally, tuples have been simplified. The distinction between "simple" and "complex" tuple types is eliminated. All tuples, even those explicitly declared using `struct { ... }` syntax, use structural equivalence, and do not undergo staged type resolution. Tuples are very restricted: they cannot have non-`auto` layouts, cannot have aligned fields, and cannot have default values with the exception of `comptime` fields. Tuples currently do not have optimized layout, but this can be changed in the future. This change simplifies the language, and fixes some problematic coercions through pointers which led to unintuitive behavior. Resolves: #16865
2024-10-04remove `@fence` (#21585)David Rubin
closes #11650
2024-09-01compiler: implement labeled switch/continuemlugg
2024-09-01Air: add explicit `repeat` instruction to repeat loopsmlugg
This commit introduces a new AIR instruction, `repeat`, which causes control flow to move back to the start of a given AIR loop. `loop` instructions will no longer automatically perform this operation after control flow reaches the end of the body. The motivation for making this change now was really just consistency with the upcoming implementation of #8220: it wouldn't make sense to have this feature work significantly differently. However, there were already some TODOs kicking around which wanted this feature. It's useful for two key reasons: * It allows loops over AIR instruction bodies to loop precisely until they reach a `noreturn` instruction. This allows for tail calling a few things, and avoiding a range check on each iteration of a hot path, plus gives a nice assertion that validates AIR structure a little. This is a very minor benefit, which this commit does apply to the LLVM and C backends. * It should allow for more compact ZIR and AIR to be emitted by having AstGen emit `repeat` instructions more often rather than having `continue` statements `break` to a `block` which is *followed* by a `repeat`. This is done in status quo because `repeat` instructions only ever cause the direct parent block to repeat. Now that AIR is more flexible, this flexibility can be pretty trivially extended to ZIR, and we can then emit better ZIR. This commit does not implement this. Support for this feature is currently regressed on all self-hosted native backends, including x86_64. This support will be added where necessary before this branch is merged.
2024-09-01Air: direct representation of ranges in switch casesmlugg
This commit modifies the representation of the AIR `switch_br` instruction to represent ranges in cases. Previously, Sema emitted different AIR in the case of a range, where the `else` branch of the `switch_br` contained a simple `cond_br` for each such case which did a simple range check (`x > a and x < b`). Not only does this add complexity to Sema, which we would like to minimize, but it also gets in the way of the implementation of #8220. That proposal turns certain `switch` statements into a looping construct, and for optimization purposes, we want to lower this to AIR fairly directly (i.e. without involving a `loop` instruction). That means we would ideally like a single instruction to represent the entire `switch` statement, so that we can dispatch back to it with a different operand as in #8220. This is not really possible to do correctly under the status quo system. This commit implements lowering of this new `switch_br` usage in the LLVM and C backends. The C backend just turns any case containing ranges entirely into conditionals, as before. The LLVM backend is a little smarter, and puts scalar items into the `switch` instruction, only using conditionals for the range cases (which direct to the same bb). All remaining self-hosted backends are temporarily regressed in the presence of switch range cases. This functionality will be restored for at least the x86_64 backend before merge.
2024-08-28std: update `std.builtin.Type` fields to follow naming conventionsmlugg
The compiler actually doesn't need any functional changes for this: Sema does reification based on the tag indices of `std.builtin.Type` already! So, no zig1.wasm update is necessary. This change is necessary to disallow name clashes between fields and decls on a type, which is a prerequisite of #9938.