path: root/src/Sema.zig
Age | Commit message | Author
2025-01-24 | x86_64: rewrite scalar and vector int `@min` and `@max` | Jacob Young
2025-01-24 | x86_64: rewrite float vector `@abs` and equality comparisons | Jacob Young
2025-01-24 | Zcu: remove `null_stack_trace` | mlugg
The new simplifications to the panic handler have eliminated the need for this piece of memoized state.
2025-01-24 | all: update for `panic.unwrapError` and `panic.call` signature changes | mlugg
2025-01-24 | Sema: prepare to remove `?*StackTrace` argument from `unwrapError` and `call` | mlugg
Now that we propagate the error return trace to all `callconv(.auto)` functions, passing it explicitly to panic handlers is redundant.
2025-01-24 | compiler: yet more panic handler changes | mlugg
* `std.builtin.Panic` -> `std.builtin.panic`, because it is a namespace.
* `root.Panic` -> `root.panic` for the same reason. There are type checks so that we still allow the legacy `pub fn panic` strategy in the 0.14.0 release.
* `std.debug.SimplePanic` -> `std.debug.simple_panic`, same reason.
* `std.debug.NoPanic` -> `std.debug.no_panic`, same reason.
* `std.debug.FormattedPanic` is now a function `std.debug.FullPanic` which takes as input a `panicFn` and returns a namespace with all the panic functions. This handles the incredibly common case of just wanting to override how the message is printed, whilst keeping nice formatted panics.
* Remove `std.builtin.panic.messages`; now, every safety panic has its own function. This reduces binary bloat, as calls to these functions no longer need to prepare any arguments (aside from the error return trace).
* Remove some legacy declarations, since a zig1.wasm update has happened. Most of these were related to the panic handler, but a quick grep for "zig1" brought up a couple more results too.

Also, add some missing type checks to Sema.

Resolves: #22584

formatted -> full
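A minimal sketch of the "override only how the message is printed" case that `std.debug.FullPanic` is meant to cover. The handler name `myPanic` is hypothetical, and the `(msg, first_trace_addr)` signature is an assumption based on the state after the `?*StackTrace` argument was removed; check `std.debug.FullPanic` in your Zig version before relying on it.

```zig
const std = @import("std");

// Hypothetical handler: only customizes how the panic message is printed.
// Assumed signature: fn ([]const u8, ?usize) noreturn.
fn myPanic(msg: []const u8, first_trace_addr: ?usize) noreturn {
    _ = first_trace_addr;
    std.debug.print("custom panic: {s}\n", .{msg});
    std.process.abort();
}

// Declared in the root source file. `FullPanic` returns a namespace that
// contains every safety-panic function, each forwarding to `myPanic`.
pub const panic = std.debug.FullPanic(myPanic);

pub fn main() void {
    // An explicit panic is routed through the namespace declared above.
    @panic("boom");
}
```

The point of the design is that the root file supplies one function, while the formatted messages for individual safety checks stay in the standard library.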
2025-01-22 | Merge pull request #22572 from jacobly0/new-error-trace | Matthew Lugg
compiler: include error trace in all functions, implement for x86_64 backend
2025-01-22 | compiler: pass error return traces everywhere | mlugg
2025-01-22 | Sema: fix crash when `inline` loop condition is not comptime-known | mlugg
2025-01-21 | compiler: simplify generic functions, fix issues with inline calls | mlugg
The original motivation here was to fix regressions caused by #22414. However, while working on this, I ended up discussing a language simplification with Andrew, which changes things a little from how they worked before #22414.

The main user-facing change here is that any reference to a prior function parameter, even if potentially comptime-known at the usage site or even not analyzed, now makes a function generic. This applies even if the parameter being referenced is not a `comptime` parameter, since it could still be populated when performing an inline call. This is a breaking language change.

The detection of this is done in AstGen; when evaluating a parameter type or return type, we track whether it referenced any prior parameter, and if so, we mark this type as being "generic" in ZIR. This will cause Sema to not evaluate it until the time of instantiation or inline call.

A lovely consequence of this from an implementation perspective is that it eliminates the need for most of the "generic poison" system. In particular, `error.GenericPoison` is now completely unnecessary, because we identify generic expressions earlier in the pipeline; this simplifies the compiler and avoids redundant work. This also entirely eliminates the concept of the "generic poison value".

The only remnant of this system is the "generic poison type" (`Type.generic_poison` and `InternPool.Index.generic_poison_type`). This type is used in two places:
* During semantic analysis, to represent an unknown result type.
* When storing generic function types, to represent a generic parameter/return type.

It's possible that these use cases should instead use `.none`, but I leave that investigation to a future adventurer.

One last thing. Prior to #22414, inline calls were a little inefficient, because they re-evaluated even non-generic parameter types whenever they were called. Changing this behavior is what ultimately led to #22538. Well, because the new logic will mark a type expression as generic if there is any chance its resolved type could differ in an inline call, this redundant work is unnecessary! So, this is another way in which the new design reduces redundant work and complexity.

Resolves: #22494
Resolves: #22532
Resolves: #22538
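As an illustration of the rule above, here is a small sketch with two hypothetical functions. In `addSame`, the type of `y` refers to the prior parameter `x`, so under the described rule the function is treated as generic even though `x` is not a `comptime` parameter. `addPlain` refers to no prior parameter and stays non-generic. Calls look the same either way.

```zig
const std = @import("std");

// `y`'s type expression references the prior parameter `x`. Per the commit
// message, that reference alone marks the type as generic in ZIR.
fn addSame(x: u32, y: @TypeOf(x)) u32 {
    return x + y;
}

// No parameter or return type references a prior parameter.
fn addPlain(x: u32, y: u32) u32 {
    return x + y;
}

test "both forms are called the same way" {
    try std.testing.expectEqual(@as(u32, 5), addSame(2, 3));
    try std.testing.expectEqual(@as(u32, 5), addPlain(2, 3));
}
```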
2025-01-21 | Sema: fix `is_non_null_ptr` handling for runtime-known pointers | mlugg
We can still often determine a comptime result based on the type, even if the pointer is runtime-known. Also, we previously used load -> is non null instead of AIR `is_non_null_ptr` if the pointer is comptime-known, but that's a bad heuristic. Instead, we should check for the pointer to be comptime-known, *and* for the load to be comptime-known, and only in that case should we call `Sema.analyzeIsNonNull`. Resolves: #22556
2025-01-18 | Sema: don't try to initialize global union pointer at comptime | mlugg
Resolves: #19832
2025-01-18 | incremental: fix enum resolution bugs | mlugg
2025-01-16 | Sema: prepare for `sentinel` -> `sentinel_ptr` field rename | mlugg
The commit two after this one will explain this diff.
2025-01-16 | compiler: make it easier to apply breaking changes to `std.builtin` | mlugg
Documentation for this will be on the wiki shortly. Resolves: #21842
2025-01-16 | all: update to `std.builtin.Type.Pointer.Size` field renames | mlugg
This was done by regex substitution with `sed`. I then manually went over the entire diff and fixed any incorrect changes. This diff also changes a lot of `callconv(.C)` to `callconv(.c)`, since my regex happened to also trigger here. I opted to leave these changes in, since they *are* a correct migration, even if they're not the one I was trying to do!
2025-01-15 | compiler: add type safety for export indices | Andrew Kelley
2025-01-14 | Sema: more validation for builtin decl types | mlugg
Also improve the source locations when this validation fails. Resolves: #22465
2025-01-14 | Sema: fix UB in error reporting | mlugg
And add test coverage for the compile error in question.
2025-01-13 | Sema: disallow non scalar sentinels in array types and reified types (#22473) | xdBronch
2025-01-13 | Sema: allow tail calls of function pointers | mlugg
Resolves: #22474
2025-01-11 | compiler: improve "... contains reference to comptime var" errors | mlugg
`Sema.explainWhyValueContainsReferenceToComptimeVar` (concise name!) adds notes to an error explaining how to get from a given `Value` to a pointer to some `comptime var` (or a comptime field). Previously, this error could be very opaque in any case where it wasn't obvious where the comptime var pointer came from; particularly for type captures. Now, the error notes explain this to the user.
2025-01-09 | Sema: rewrite semantic analysis of function calls | mlugg
This rewrite improves some error messages, hugely simplifies the logic, and fixes several bugs. One of these bugs is technically a new rule which Andrew and I agreed on: if a parameter has a comptime-only type but is not declared `comptime`, then the corresponding call argument should not be *evaluated* at comptime; only resolved. Implementing this required changing how function types work a little, which in turn required allowing a new kind of function coercion for some generic use cases: function coercions are now allowed to implicitly *remove* `comptime` annotations from parameters with comptime-only types. This is okay because removing the annotation affects only the call site. Resolves: #22262
2025-01-07 | Sema: fix invalid AIR from array concat | David Rubin
2025-01-05 | Sema: fix incorrect type in `optional_payload` instruction | mlugg
Resolves: #22417
2025-01-05 | compiler: slightly simplify builtin decl memoization | mlugg
Rather than `Zcu.BuiltinDecl.Memoized` being a struct with fields, it can instead just be an array, indexed by the enum. This allows runtime indexing, avoiding a few now-unnecessary `inline` switch cases.
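The idea of replacing a struct-with-fields by an enum-indexed array can be sketched as follows. The enum tags, the `?u32` payload, and the `get` helper are hypothetical stand-ins, not the compiler's actual definitions. The `@typeInfo(...).@"enum"` spelling assumes a Zig version with the lowercase `std.builtin.Type` field names.

```zig
const std = @import("std");

// Hypothetical stand-in for the set of memoized builtin declarations.
const BuiltinDecl = enum { Type, StackTrace, panic };

const decl_count = @typeInfo(BuiltinDecl).@"enum".fields.len;

// One slot per enum tag; `null` means "not resolved yet".
const Memoized = [decl_count]?u32;

fn get(memo: *const Memoized, decl: BuiltinDecl) ?u32 {
    // Runtime indexing by tag value: no `inline` switch over field names.
    return memo[@intFromEnum(decl)];
}

test "enum-indexed memoization table" {
    var memo: Memoized = [_]?u32{null} ** decl_count;
    memo[@intFromEnum(BuiltinDecl.panic)] = 42;
    try std.testing.expectEqual(@as(?u32, 42), get(&memo, .panic));
    try std.testing.expectEqual(@as(?u32, null), get(&memo, .Type));
}
```

With a struct, selecting a field from a runtime-known tag needs an `inline` switch that expands to one branch per field; the array form is a single indexed load.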
2025-01-04 | incremental: new `AnalUnit` to group dependencies on `std.builtin` decls | mlugg
This commit reworks how values like the panic handler function are memoized during a compiler invocation. Previously, the value was resolved by whichever analysis requested it first, and cached on `Zcu`. This is problematic for incremental compilation, as after the initial resolution, no dependencies are marked by users of this memoized state. This is arguably acceptable for `std.builtin`, but it's definitely not acceptable for the panic handler/messages, because those can be set by the user (`std.builtin.Panic` checks `@import("root").Panic`).

So, here we introduce a new kind of `AnalUnit`, called `memoized_state`. There are 3 such units:
* `.{ .memoized_state = .va_list }` resolves the type `std.builtin.VaList`
* `.{ .memoized_state = .panic }` resolves `std.Panic`
* `.{ .memoized_state = .main }` resolves everything else we want

These units essentially "bundle" the resolution of their corresponding declarations, storing the results into fields on `Zcu`. This way, when, for instance, a function wants to call the panic handler, it simply runs `ensureMemoizedStateResolved`, registering one dependency, and pulls the values from the `Zcu`. This "bundling" minimizes dependency edges. The 3 units are separated to allow them to act independently: for instance, the panic handler can use `std.builtin.Type` without triggering a dependency loop.
2025-01-04 | incremental: correctly return `error.AnalysisFail` when type structure changes | mlugg
`Zcu.PerThread.ensureTypeUpToDate` is set up in such a way that it only returns the updated type the first time it is called. In general, that's okay; however, the exception is that we want the function to continue returning `error.AnalysisFail` when the type has been lost, or its number of captures changed. Therefore, the check for this case now happens before the up-to-date success return. For simplicity, the number of captures is now handled by intentionally losing the instruction in `Zcu.mapOldZirToNew`, since there is nothing to gain from tracking a type when old instances of it can never be reused.
2025-01-03 | Zir: split up start and end of range in `for_len` | mlugg
The old lowering was kind of neat, but it unintentionally allowed the syntax `for (123) |_| { ... }`, and there wasn't really a way to fix that. So, instead, we include both the start and the end of the range in the `for_len` instruction (each operand to `for` now has *two* entries in this multi-op instruction). This slightly increases the size of ZIR for loops of predominantly indexables, but the difference is small enough that it's not worth complicating ZIR to try and fix it.
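For reference, a short sketch of the two kinds of `for` operand that the `for_len` instruction has to describe: indexables and ranges. The array contents are made up for illustration.

```zig
const std = @import("std");

test "for operands are indexables or ranges" {
    const items = [_]u8{ 10, 20, 30 };

    // An indexable paired with an open-ended range: the range's end is
    // inferred from the length of `items`.
    var weighted: usize = 0;
    for (items, 0..) |item, i| {
        weighted += item * i;
    }
    try std.testing.expectEqual(@as(usize, 80), weighted); // 0 + 20 + 60

    // A range with explicit start and end.
    var count: usize = 0;
    for (0..3) |_| count += 1;
    try std.testing.expectEqual(@as(usize, 3), count);

    // `for (123) |_| {}` is the form the old lowering accidentally allowed:
    // a bare integer is neither an indexable nor a range.
}
```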
2025-01-02 | Sema: correctly label `block_comptime` for restoring error return trace index | mlugg
Resolves: #22384
2025-01-01 | Sema: fix invalid coercion `*[n:x]T` -> `*[m]T` for `n != m` | mlugg
The change in `Sema.coerceExtra` is just to avoid an unhelpful error message, covered by the added test case. Resolves: #22373
2024-12-31 | Sema: add doc comments for comptime reason types | mlugg
2024-12-31 | Sema: remove some incorrect calls to `requireRuntimeBlock` | mlugg
Most calls to `requireRuntimeBlock` in Sema are not correct. This commit doesn't deal with all of them, but it does deal with ones which have, in combination with the past few commits, introduced real-world regressions. Related: #22353
2024-12-31 | compiler: ensure local `const`s in comptime scope are comptime-known | mlugg
This fixes a bug which exposed a compiler implementation detail (ZIR alloc elision). Previously, `const` declarations with a runtime-known value in a comptime scope were permitted only if AstGen was able to elide the alloc in ZIR, since the error was reported by storing to the comptime alloc. This just adds a new instruction to also emit this error when the alloc is elided.
2024-12-31 | compiler: ensure result of `block_comptime` is comptime-known | mlugg
To avoid this PR regressing error messages, most of the work here has gone towards improving error notes for why code was comptime-evaluated. ZIR `block_comptime` now stores a "comptime reason", the enum for which is also used by Sema. There are two types in Sema:
* `ComptimeReason` represents the reason we started evaluating something at comptime.
* `BlockComptimeReason` represents the reason a given block is evaluated at comptime; it's either a `ComptimeReason` with an attached source location, or it's because we're in a function which was called at comptime (and that function's `Block` should be consulted for the "parent" reason).

Every `Block` stores a `?BlockComptimeReason`. The old `is_comptime` field is replaced with a trivial `isComptime()` method which returns whether that reason is non-`null`.

Lastly, the handling for `block_comptime` has been simplified. It was previously going through an unnecessary runtime-handling path; now, it is a trivial sub block exited through a `break_inline` instruction.

Resolves: #22296
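The relationship between the two reason types and `Block` can be sketched roughly as below. Every field name, enum tag, and the `rootReason` helper are hypothetical and only mirror the prose above; the real definitions in `Sema.zig` differ.

```zig
const std = @import("std");

// Hypothetical source location type.
const SrcLoc = struct { line: u32, column: u32 };

// Why comptime evaluation started (tags are illustrative only).
const ComptimeReason = enum { comptime_keyword, array_length, type_expression };

// Why a given block is comptime: a direct reason with a location, or
// "we were called at comptime", deferring to the calling block.
const BlockComptimeReason = union(enum) {
    reason: struct { src: SrcLoc, r: ComptimeReason },
    inlining_parent: *const Block,
};

const Block = struct {
    comptime_reason: ?BlockComptimeReason,

    fn isComptime(block: *const Block) bool {
        return block.comptime_reason != null;
    }

    // Follow parent links until a direct reason is found.
    fn rootReason(block: *const Block) ?ComptimeReason {
        var cur = block;
        while (true) {
            const bcr = cur.comptime_reason orelse return null;
            switch (bcr) {
                .reason => |direct| return direct.r,
                .inlining_parent => |parent| cur = parent,
            }
        }
    }
};

test "a callee block inherits its caller's comptime reason" {
    const caller: Block = .{ .comptime_reason = BlockComptimeReason{
        .reason = .{ .src = .{ .line = 3, .column = 5 }, .r = .array_length },
    } };
    const callee: Block = .{ .comptime_reason = BlockComptimeReason{ .inlining_parent = &caller } };
    const runtime: Block = .{ .comptime_reason = null };

    try std.testing.expect(callee.isComptime());
    try std.testing.expect(!runtime.isComptime());
    try std.testing.expectEqual(@as(?ComptimeReason, .array_length), callee.rootReason());
}
```

Storing the reason instead of a bare `is_comptime` flag is what lets an error note explain why a piece of code was evaluated at comptime, including through a chain of comptime calls.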
2024-12-24 | compiler: analyze type and value of global declaration separately | mlugg
This commit separates semantic analysis of the annotated type vs value of a global declaration, therefore allowing recursive and mutually recursive values to be declared.

Every `Nav` which undergoes analysis now has *two* corresponding `AnalUnit`s: `.{ .nav_val = n }` and `.{ .nav_ty = n }`. The `nav_val` unit is responsible for *fully resolving* the `Nav`: determining its value, linksection, addrspace, etc. The `nav_ty` unit, on the other hand, resolves only the information necessary to construct a *pointer* to the `Nav`: its type, addrspace, etc. (It does also analyze its linksection, but that could be moved to `nav_val` I think; it doesn't make any difference).

Analyzing a `nav_ty` for a declaration with no type annotation will just mark a dependency on the `nav_val`, analyze it, and finish. Conversely, analyzing a `nav_val` for a declaration *with* a type annotation will first mark a dependency on the `nav_ty` and analyze it, using this as the result type when evaluating the value body.

The `nav_val` and `nav_ty` units always have references to one another: so, if a `Nav`'s type is referenced, its value implicitly is too, and vice versa. However, these dependencies are trivial, so, to save memory, are only known implicitly by logic in `resolveReferences`.

In general, analyzing ZIR `decl_val` will only analyze `nav_ty` of the corresponding `Nav`. There are two exceptions to this. If the declaration is an `extern` declaration, then we immediately ensure the `Nav` value is resolved (which doesn't actually require any more analysis, since such a declaration has no value body anyway). Additionally, if the resolved type has type tag `.@"fn"`, we again immediately resolve the `Nav` value. The latter restriction is in place for two reasons:
* Functions are special, in that their externs are allowed to trivially alias; i.e. with a declaration `extern fn foo(...)`, you can write `const bar = foo;`. This is not allowed for non-function externs, and it means that function types are the only place where it is possible for a declaration `Nav` to have a `.@"extern"` value without actually being declared `extern`. We need to identify this situation immediately so that the `decl_ref` can create a pointer to the *real* extern `Nav`, not this alias.
* In certain situations, such as taking a pointer to a `Nav`, Sema needs to queue analysis of a runtime function if the value is a function. To do this, the function value needs to be known, so we need to resolve the value immediately upon `&foo` where `foo` is a function.

This restriction is simple to codify into the eventual language specification, and doesn't limit the utility of this feature in practice.

A consequence of this commit is that codegen and linking logic needs to be more careful when looking at `Nav`s. In general:
* When `updateNav` or `updateFunc` is called, it is safe to assume that the `Nav` being updated (the owner `Nav` for `updateFunc`) is fully resolved.
* Any `Nav` whose value is/will be an `@"extern"` or a function is fully resolved; see `Nav.getExtern` for a helper for a common case here.
* Any other `Nav` may only have its type resolved.

This didn't seem to be too tricky to satisfy in any of the existing codegen/linker backends.

Resolves: #131
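A small sketch of what the type/value split makes possible: two globals whose values point at each other. The `Node` type is made up for illustration. Both declarations carry a type annotation, so a pointer to either one can be formed from its type alone, before its value has been resolved.

```zig
const std = @import("std");

// Hypothetical type used only for this illustration.
const Node = struct {
    value: u32,
    next: *const Node,
};

// Mutually recursive values: forming `&b` needs only `b`'s annotated type,
// not its value, and likewise for `&a`.
const a: Node = .{ .value = 1, .next = &b };
const b: Node = .{ .value = 2, .next = &a };

test "mutually recursive globals" {
    try std.testing.expect(a.next == &b);
    try std.testing.expect(b.next == &a);
    try std.testing.expectEqual(@as(u32, 1), b.next.value);
}
```

Without the annotations, resolving the type of `a` would require its value, which in turn requires `b`, and the declarations would depend on each other in a loop.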
2024-12-24 | compiler: remove Cau | mlugg
The `Cau` abstraction originated from noting that one of the two primary roles of the legacy `Decl` type was to be the subject of comptime semantic analysis. However, the data stored in `Cau` has always had some level of redundancy. While preparing for #131, I went to remove that redundancy, and realised that `Cau` now had exactly one field: `owner`.

This led me to conclude that `Cau` is, in fact, an unnecessary level of abstraction over what are in reality *fundamentally different* kinds of analysis unit (`AnalUnit`). Types, `Nav` vals, and `comptime` declarations are all analyzed in different ways, and trying to treat them as the same thing is counterproductive! So, these 3 cases are now different alternatives in `AnalUnit`.

To avoid stealing bits from `InternPool`-based IDs, which are already a little starved for bits due to the sharding datastructures, `AnalUnit` is expanded to 64 bits (30 of which are currently unused). This doesn't impact memory usage too much by default, because we don't store `AnalUnit`s all too often; however, we do store them a lot under `-fincremental`, so a non-trivial bump to peak RSS can be observed there. This will be improved in the future when I make `InternPool.DepEntry` less memory-inefficient.

`Zcu.PerThread.ensureCauAnalyzed` is split into 3 functions, for each of the 3 new types of `AnalUnit`. The new logic is much easier to understand, because it avoids conflating the logic of these fundamentally different cases.
2024-12-23 | Zir: refactor `declaration` instruction representation | mlugg
The new representation is often more compact. It is also more straightforward to understand: for instance, `extern` is represented on the `declaration` instruction itself rather than using a special instruction. The same applies to `var`, making both of these far more compact. This commit also separates the type and value bodies of a `declaration` instruction. This is a prerequisite for #131. In general, `declaration` now directly encodes details of the syntax form used, and the embedded ZIR bodies are for actual expressions. The only exception to this is functions, where ZIR is effectively designed as if we had #1717. `extern fn` declarations are modeled as `extern const` with a function type, and normal `fn` definitions are modeled as `const` with a `func{,_fancy,_inferred}` instruction. This may change in the future, but improving on this was out of scope for this commit.
2024-12-18 | compiler: disallow `callconv` etc from depending on function parameters | mlugg
Resolves: #22261
2024-12-18 | compiler: move `RuntimeIndex` to `Sema` | mlugg
Just a small refactor.
2024-12-16 | Sema: disallow unsafe in-memory coercions | mlugg
The error messages here aren't amazing yet, but this is an improvement on status quo, because the current behavior allows false negative compile errors, so effectively miscompiles. Resolves: #15874
2024-12-16 | Merge pull request #22245 from mlugg/zir-no-doc-comments | Matthew Lugg
compiler: remove doc comments from Zir
2024-12-15 | compiler: remove doc comments from Zir | mlugg
This code was left over from the legacy Autodoc implementation. No component of the compiler pipeline actually requires doc comments, so it is a waste of time and space to store them in ZIR.
2024-12-15 | Sema: disallow runtime stores to pointers with comptime-only element types | mlugg
2024-12-15 | Sema: do not allow coercing undefined to opaque types | mlugg
2024-12-14 | ensure `InstMap` capacity before remapping error code | David Rubin
2024-12-09 | Merge pull request #22157 from mlugg/astgen-error-lazy | Andrew Kelley
compiler: allow semantic analysis of files with AstGen errors
2024-12-09 | Merge pull request #22164 from mlugg/astgen-ref-dedup | Andrew Kelley
AstGen: correctly deduplicate `ref` of `param` and `alloc_inferred`
2024-12-08 | Sema: fix use of Zcu.LazySrcLoc in error message | wooster0
It currently prints as: :3:18: error: untagged union 'Zcu.LazySrcLoc{ .base_node_inst = InternPool.TrackedInst.Index(104), .offset = Zcu.LazySrcLoc.Offset{ .node_offset = Zcu.LazySrcLoc.Offset.TracedOffset{ .x = -2, .trace = (value tracing disabled) } } }' cannot be converted to integer
2024-12-08 | AstGen: correctly deduplicate `ref` of `param` and `alloc_inferred` | mlugg
Both of these instructions were previously under a special case in `rvalue` which resulted in every reference to such an instruction adding a new `ref` instruction. This had the effect that, for instance, `&a != &a` for parameters. Deduplicating these `ref` instructions was problematic for different reasons.

For `alloc_inferred`, the problem was that it's not valid to `ref` the alloc until the allocation has been resolved (`resolve_inferred_alloc`), but `AstGen.appendBodyWithFixups` would place the `ref` directly after the `alloc_inferred`. This is solved by bringing `resolve_inferred_alloc` in line with `make_ptr_const` by having it *return* the final pointer, rather than modifying `sema.inst_map` of the original `alloc_inferred`. That way, the `ref` refers to the `resolve_inferred_alloc` instruction, so is placed immediately after it, avoiding this issue.

For `param`, the problem is a bit trickier: `param` instructions live in a body which must contain only `param` instructions, then a `func{,_inferred,_fancy}`, then a `break_inline`. Moreover, `param` instructions may be referenced not only by the function body, but also by other parameters, the return type expression, etc. Each of these bodies requires separate `ref` instructions. This is solved by pulling entries out of `ref_table` after evaluating each component of the function declaration, and appending the refs later on when actually putting the bodies together.

This gives way to another issue: if you write `fn f(x: T) @TypeOf(x.foo())`, then since `x.foo()` takes a reference to `x`, this `ref` instruction is now in a comptime context (outside of the `@TypeOf` ZIR body), so emits a compile error. This is solved by loosening the rules around `ref` instructions; because they are not side-effecting, it is okay to allow `ref` of runtime values at comptime, resulting in a runtime-known value in a comptime scope. We already apply this mechanism in some cases; for instance, it's why `runtime_array.len` works in a `comptime` context. In future, we will want to give similar treatment to many operations in Sema: in general, it's fine to apply runtime operations at comptime provided they don't have side effects!

Resolves: #22140
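Two user-visible effects mentioned above can be sketched as follows, with hypothetical function names. The first check relies on the deduplicated `ref` of a parameter. The second uses a runtime array's `.len` in an array-length expression, which is a comptime-only position.

```zig
const std = @import("std");

// With `ref` deduplicated, both `&x` expressions refer to the same pointer.
fn sameAddress(x: u32) bool {
    return &x == &x;
}

// `arr` is a runtime value, yet `arr.len` is usable where a comptime-known
// value is required, because the length comes from the array's type.
fn reversed(arr: [4]u8) [4]u8 {
    var out: [arr.len]u8 = undefined;
    for (arr, 0..) |v, i| out[arr.len - 1 - i] = v;
    return out;
}

test "parameter address is stable and runtime array len is comptime-known" {
    try std.testing.expect(sameAddress(7));
    try std.testing.expectEqual([4]u8{ 4, 3, 2, 1 }, reversed(.{ 1, 2, 3, 4 }));
}
```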