| Age | Commit message (Collapse) | Author |
|
avoids unnecessary copies
|
|
with field_ptr_load and field_ptr_named_load.
These avoid doing by-val load operations for structs that are
runtime-known while keeping the previous semantics for comptime-known
values.
|
|
and delete deprecated alias std.io
|
|
Uses a new `float_op_result_ty` ZIR instruction tag.
|
|
|
|
If both are used, 'else' handles named members and '_' handles
unnamed members. In this case the 'else' prong will be unrolled
to an explicit case containing all remaining named values.
|
|
Mainly affects ZIR representation of switch_block[_ref]
and special prong (detection) logic for switch.
Adds a new SpecialProng tag 'absorbing_under' that allows
specifying additional explicit tags in a '_' prong which
are respected when checking that every value is handled
during semantic analysis but are not transformed into AIR
and instead 'absorbed' by the '_' branch.
|
|
|
|
|
|
added adapter to AnyWriter and GenericWriter to help bridge the gap
between old and new API
make std.testing.expectFmt work at compile-time
std.fmt no longer has a dependency on std.unicode. Formatted printing
was never properly unicode-aware. Now it no longer pretends to be.
Breakage/deprecations:
* std.fs.File.reader -> std.fs.File.deprecatedReader
* std.fs.File.writer -> std.fs.File.deprecatedWriter
* std.io.GenericReader -> std.io.Reader
* std.io.GenericWriter -> std.io.Writer
* std.io.AnyReader -> std.io.Reader
* std.io.AnyWriter -> std.io.Writer
* std.fmt.format -> std.fmt.deprecatedFormat
* std.fmt.fmtSliceEscapeLower -> std.ascii.hexEscape
* std.fmt.fmtSliceEscapeUpper -> std.ascii.hexEscape
* std.fmt.fmtSliceHexLower -> {x}
* std.fmt.fmtSliceHexUpper -> {X}
* std.fmt.fmtIntSizeDec -> {B}
* std.fmt.fmtIntSizeBin -> {Bi}
* std.fmt.fmtDuration -> {D}
* std.fmt.fmtDurationSigned -> {D}
* {} -> {f} when there is a format method
* format method signature
- anytype -> *std.io.Writer
- inferred error set -> error{WriteFailed}
- options -> (deleted)
* std.fmt.Formatted
- now takes context type explicitly
- no fmt string
|
|
closes #20663
|
|
Also remove `@frameSize`, closing #3654.
While the other machinery might remain depending on #23446, it is
settled that there will not be `async`/ `await` keywords in the
language.
|
|
This commit makes some big changes to how we track state for Zig source
files. In particular, it changes:
* How `File` tracks its path on-disk
* How AstGen discovers files
* How file-level errors are tracked
* How `builtin.zig` files and modules are created
The original motivation here was to address incremental compilation bugs
with the handling of files, such as #22696. To fix this, a few changes
are necessary.
Just like declarations may become unreferenced on an incremental update,
meaning we suppress analysis errors associated with them, it is also
possible for all imports of a file to be removed on an incremental
update, in which case file-level errors for that file should be
suppressed. As such, after AstGen, the compiler must traverse files
(starting from analysis roots) and discover the set of "live files" for
this update.
Additionally, the compiler's previous handling of retryable file errors
was not very good; the source location the error was reported as was
based only on the first discovered import of that file. This source
location also disappeared on future incremental updates. So, as a part
of the file traversal above, we also need to figure out the source
locations of imports which errors should be reported against.
Another observation I made is that the "file exists in multiple modules"
error was not implemented in a particularly good way (I get to say that
because I wrote it!). It was subject to races, where the order in which
different imports of a file were discovered affects both how errors are
printed, and which module the file is arbitrarily assigned, with the
latter in turn affecting which other files are considered for import.
The thing I realised here is that while the AstGen worker pool is
running, we cannot know for sure which module(s) a file is in; we could
always discover an import later which changes the answer.
So, here's how the AstGen workers have changed. We initially ensure that
`zcu.import_table` contains the root files for all modules in this Zcu,
even if we don't know any imports for them yet. Then, the AstGen
workers do not need to be aware of modules. Instead, they simply ignore
module imports, and only spin off more workers when they see a by-path
import.
During AstGen, we can't use module-root-relative paths, since we don't
know which modules files are in; but we don't want to unnecessarily use
absolute files either, because those are non-portable and can make
`error.NameTooLong` more likely. As such, I have introduced a new
abstraction, `Compilation.Path`. This type is a way of representing a
filesystem path which has a *canonical form*. The path is represented
relative to one of a few special directories: the lib directory, the
global cache directory, or the local cache directory. As a fallback, we
use absolute (or cwd-relative on WASI) paths. This is kind of similar to
`std.Build.Cache.Path` with a pre-defined list of possible
`std.Build.Cache.Directory`, but has stricter canonicalization rules
based on path resolution to make sure deduplicating files works
properly. A `Compilation.Path` can be trivially converted to a
`std.Build.Cache.Path` from a `Compilation`, but is smaller, has a
canonical form, and has a digest which will be consistent across
different compiler processes with the same lib and cache directories
(important when we serialize incremental compilation state in the
future). `Zcu.File` and `Zcu.EmbedFile` both contain a
`Compilation.Path`, which is used to access the file on-disk;
module-relative sub paths are used quite rarely (`EmbedFile` doesn't
even have one now for simplicity).
After the AstGen workers all complete, we know that any file which might
be imported is definitely in `import_table` and up-to-date. So, we
perform a single-threaded graph traversal; similar to what
`resolveReferences` plays for `AnalUnit`s, but for files instead. We
figure out which files are alive, and which module each file is in. If a
file turns out to be in multiple modules, we set a field on `Zcu` to
indicate this error. If a file is in a different module to a prior
update, we set a flag instructing `updateZirRefs` to invalidate all
dependencies on the file. This traversal also discovers "import errors";
these are errors associated with a specific `@import`. With Zig's
current design, there is only one possible error here: "import outside
of module root". This must be identified during this traversal instead
of during AstGen, because it depends on which module the file is in. I
tried also representing "module not found" errors in this same way, but
it turns out to be much more useful to report those in Sema, because of
use cases like optional dependencies where a module import is behind a
comptime-known build option.
For simplicity, `failed_files` now just maps to `?[]u8`, since the
source location is always the whole file. In fact, this allows removing
`LazySrcLoc.Offset.entire_file` completely, slightly simplifying some
error reporting logic. File-level errors are now directly built in the
`std.zig.ErrorBundle.Wip`. If the payload is not `null`, it is the
message for a retryable error (i.e. an error loading the source file),
and will be reported with a "file imported here" note pointing to the
import site discovered during the single-threaded file traversal.
The last piece of fallout here is how `Builtin` works. Rather than
constructing "builtin" modules when creating `Package.Module`s, they are
now constructed on-the-fly by `Zcu`. The map `Zcu.builtin_modules` maps
from digests to `*Package.Module`s. These digests are abstract hashes of
the `Builtin` value; i.e. all of the options which are placed into
"builtin.zig". During the file traversal, we populate `builtin_modules`
as needed, so that when we see this imports in Sema, we just grab the
relevant entry from this map. This eliminates a bunch of awkward state
tracking during construction of the module graph. It's also now clearer
exactly what options the builtin module has, since previously it
inherited some options arbitrarily from the first-created module with
that "builtin" module!
The user-visible effects of this commit are:
* retryable file errors are now consistently reported against the whole
file, with a note pointing to a live import of that file
* some theoretical bugs where imports are wrongly considered distinct
(when the import path moves out of the cwd and then back in) are fixed
* some consistency issues with how file-level errors are reported are
fixed; these errors will now always be printed in the same order
regardless of how the AstGen pass assigns file indices
* incremental updates do not print retryable file errors differently
between updates or depending on file structure/contents
* incremental updates support files changing modules
* incremental updates support files becoming unreferenced
Resolves: #22696
|
|
|
|
This commits adds the following distinct integer types to std.zig.Ast:
- OptionalTokenIndex
- TokenOffset
- OptionalTokenOffset
- Node.OptionalIndex
- Node.Offset
- Node.OptionalOffset
The `Node.Index` type has also been converted to a distinct type while
`TokenIndex` remains unchanged.
`Ast.Node.Data` has also been changed to a (untagged) union to provide
safety checks.
|
|
This reverts commit dea72d15da4fba909dc3ccb2e9dc5286372ac023, reversing
changes made to ab381933c87bcc744058d25a876cfdc0d23fc674.
The changeset does not work as advertised and does not have sufficient
test coverage.
Reopens #22822
|
|
|
|
Closes #21833.
Closes #22110.
|
|
Resolves: #21867
|
|
Instead, `source`, `tree`, and `zir` should all be optional. This is
precisely what we're actually trying to model here; and `File` isn't
optimized for memory consumption or serializability anyway, so it's fine
to use a couple of extra bytes on actual optionals here.
|
|
This commit allows using ZON (Zig Object Notation) in a few ways.
* `@import` can be used to load ZON at comptime and convert it to a
normal Zig value. In this case, `@import` must have a result type.
* `std.zon.parse` can be used to parse ZON at runtime, akin to the
parsing logic in `std.json`.
* `std.zon.stringify` can be used to convert arbitrary data structures
to ZON at runtime, again akin to `std.json`.
|
|
This commit effectively reverts 9e683f0, and hence un-accepts #19777.
While nice in theory, this proposal turned out to have a few problems.
Firstly, supplying a result type implicitly coerces the operand to this
type -- that's the main point of result types! But for `try`, this is
actually a bad idea; we want a redundant `try` to be a compile error,
not to silently coerce the non-error value to an error union. In
practice, this didn't always happen, because the implementation was
buggy anyway; but when it did, it was really quite silly. For instance,
`try try ... try .{ ... }` was an accepted expression, with the inner
initializer being initially coerced to `E!E!...E!T`.
Secondly, the result type inference here didn't play nicely with
`return`. If you write `return try`, the operand would actually receive
a result type of `E!E!T`, since the `return` gave a result type of `E!T`
and the `try` wrapped it in *another* error union. More generally, the
problem here is that `try` doesn't know when it should or shouldn't
nest error unions. This occasionally broke code which looked like it
should work.
So, this commit prevents `try` from propagating result types through to
its operand. A key motivation for the original proposal here was decl
literals; so, as a special case, `try .foo(...)` is still an allowed
syntax form, caught by AstGen and specially lowered. This does open the
doors to allowing other special cases for decl literals in future, such
as `.foo(...) catch ...`, but those proposals are for another time.
Resolves: #21991
Resolves: #22633
|
|
The original motivation here was to fix regressions caused by #22414.
However, while working on this, I ended up discussing a language
simplification with Andrew, which changes things a little from how they
worked before #22414.
The main user-facing change here is that any reference to a prior
function parameter, even if potentially comptime-known at the usage
site or even not analyzed, now makes a function generic. This applies
even if the parameter being referenced is not a `comptime` parameter,
since it could still be populated when performing an inline call. This
is a breaking language change.
The detection of this is done in AstGen; when evaluating a parameter
type or return type, we track whether it referenced any prior parameter,
and if so, we mark this type as being "generic" in ZIR. This will cause
Sema to not evaluate it until the time of instantiation or inline call.
A lovely consequence of this from an implementation perspective is that
it eliminates the need for most of the "generic poison" system. In
particular, `error.GenericPoison` is now completely unnecessary, because
we identify generic expressions earlier in the pipeline; this simplifies
the compiler and avoids redundant work. This also entirely eliminates
the concept of the "generic poison value". The only remnant of this
system is the "generic poison type" (`Type.generic_poison` and
`InternPool.Index.generic_poison_type`). This type is used in two
places:
* During semantic analysis, to represent an unknown result type.
* When storing generic function types, to represent a generic parameter/return type.
It's possible that these use cases should instead use `.none`, but I
leave that investigation to a future adventurer.
One last thing. Prior to #22414, inline calls were a little inefficient,
because they re-evaluated even non-generic parameter types whenever they
were called. Changing this behavior is what ultimately led to #22538.
Well, because the new logic will mark a type expression as generic if
there is any change its resolved type could differ in an inline call,
this redundant work is unnecessary! So, this is another way in which the
new design reduces redundant work and complexity.
Resolves: #22494
Resolves: #22532
Resolves: #22538
|
|
`Sema.explainWhyValueContainsReferenceToComptimeVar` (concise name!)
adds notes to an error explaining how to get from a given `Value` to a
pointer to some `comptime var` (or a comptime field). Previously, this
error could be very opaque in any case where it wasn't obvious where the
comptime var pointer came from; particularly for type captures. Now, the
error notes explain this to the user.
|
|
This fixes a bug which exposed a compiler implementation detail (ZIR
alloc elision). Previously, `const` declarations with a runtime-known
value in a comptime scope were permitted only if AstGen was able to
elide the alloc in ZIR, since the error was reported by storing to the
comptime alloc.
This just adds a new instruction to also emit this error when the alloc
is elided.
|
|
To avoid this PR regressing error messages, most of the work here has
gone towards improving error notes for why code was comptime-evaluated.
ZIR `block_comptime` now stores a "comptime reason", the enum for which
is also used by Sema. There are two types in Sema:
* `ComptimeReason` represents the reason we started evaluating something
at comptime.
* `BlockComptimeReason` represents the reason a given block is evaluated
at comptime; it's either a `ComptimeReason` with an attached source
location, or it's because we're in a function which was called at
comptime (and that function's `Block` should be consulted for the
"parent" reason).
Every `Block` stores a `?BlockComptimeReason`. The old `is_comptime`
field is replaced with a trivial `isComptime()` method which returns
whether that reason is non-`null`.
Lastly, the handling for `block_comptime` has been simplified. It was
previously going through an unnecessary runtime-handling path; now, it
is a trivial sub block exited through a `break_inline` instruction.
Resolves: #22296
|
|
The new representation is often more compact. It is also more
straightforward to understand: for instance, `extern` is represented on
the `declaration` instruction itself rather than using a special
instruction. The same applies to `var`, making both of these far more
compact.
This commit also separates the type and value bodies of a `declaration`
instruction. This is a prerequisite for #131.
In general, `declaration` now directly encodes details of the syntax
form used, and the embedded ZIR bodies are for actual expressions. The
only exception to this is functions, where ZIR is effectively designed
as if we had #1717. `extern fn` declarations are modeled as
`extern const` with a function type, and normal `fn` definitions are
modeled as `const` with a `func{,_fancy,_inferred}` instruction. This
may change in the future, but improving on this was out of scope for
this commit.
|
|
Resolves: #22261
|
|
This code was left over from the legacy Autodoc implementation. No
component of the compiler pipeline actually requires doc comments, so it
is a waste of time and space to store them in ZIR.
|
|
This commit enhances AstGen to introduce a form of error resilience
which allows valid ZIR to be emitted even when AstGen errors occur.
When a non-fatal AstGen error (e.g. `appendErrorNode`) occurs, ZIR
generation is not affected; the error is added to `astgen.errors` and
ultimately to the errors stored in `extra`, but that doesn't stop us
getting valid ZIR. Fatal AstGen errors (e.g. `failNode`) are a bit
trickier. These errors return `error.AnalysisFail`, which is propagated
up the stack. In theory, any parent expression can catch this error and
handle it, continuing ZIR generation whilst throwing away whatever was
lost. For now, we only do this in one place: when creating declarations.
If a call to `fnDecl`, `comptimeDecl`, `globalVarDecl`, etc, returns
`error.AnalysisFail`, the `declaration` instruction is still created,
but its body simply contains the new `extended(astgen_error())`
instruction, which instructs Sema to terminate semantic analysis with a
transitive error. This means that a fatal AstGen error causes the
innermost declaration containing the error to fail, but the rest of the
file remains intact.
If a source file contains parse errors, or an `error.AnalysisFail`
happens when lowering the top-level struct (e.g. there is an error in
one of its fields, or a name has multiple declarations), then lowering
for the entire file fails. Alongside the existing `Zir.hasCompileErrors`
query, this commit introduces `Zir.loweringFailed`, which returns `true`
only in this case.
The end result here is that files with AstGen failures will almost
always still emit valid ZIR, and hence can undergo semantic analysis on
the parts of the file which are (from AstGen's perspective) valid. This
is a noteworthy improvement to UX, but the main motivation here is
actually incremental compilation. Previously, AstGen failures caused
lots of semantic analysis work to be thrown out, because all `AnalUnit`s
in the file required re-analysis so as to trigger necessary transitive
failures and remove stored compile errors which would no longer make
sense (because a fresh compilation of this code would not emit those
errors, as the units those errors applied to would fail sooner due to
referencing a failed file). Now, this case only applies when a file has
severe top-level errors, which is far less common than something like
having an unused variable.
Lastly, this commit changes a few errors in `AstGen` to become fatal
when they were previously non-fatal and vice versa. If there is still a
reasonable way to continue AstGen and lower to ZIR after an error, it is
non-fatal; otherwise, it is fatal. For instance, `comptime const`, while
redundant syntax, has a clear meaning we can lower; on the other hand,
using an undeclared identifer has no sane lowering, so must trigger a
fatal error.
|
|
Previously, stepping from the single statement within the loop would
always exit the loop because all of the code unrolled from the loop is
associated with the same line and treated by the debugger as one line.
|
|
This commit reworks how anonymous struct literals and tuples work.
Previously, an untyped anonymous struct literal
(e.g. `const x = .{ .a = 123 }`) was given an "anonymous struct type",
which is a special kind of struct which coerces using structural
equivalence. This mechanism was a holdover from before we used
RLS / result types as the primary mechanism of type inference. This
commit changes the language so that the type assigned here is a "normal"
struct type. It uses a form of equivalence based on the AST node and the
type's structure, much like a reified (`@Type`) type.
Additionally, tuples have been simplified. The distinction between
"simple" and "complex" tuple types is eliminated. All tuples, even those
explicitly declared using `struct { ... }` syntax, use structural
equivalence, and do not undergo staged type resolution. Tuples are very
restricted: they cannot have non-`auto` layouts, cannot have aligned
fields, and cannot have default values with the exception of `comptime`
fields. Tuples currently do not have optimized layout, but this can be
changed in the future.
This change simplifies the language, and fixes some problematic
coercions through pointers which led to unintuitive behavior.
Resolves: #16865
|
|
This commit finishes implementing #21209 by removing the
`@setAlignStack` builtin in favour of `CallingConvention` payloads. The
x86_64 backend is updated to use the stack alignment given in the
calling convention (the LLVM backend was already updated in a previous
commit).
Resolves: #21209
|
|
Resolves: #20433
|
|
closes #11650
|
|
Resolves: #21341
|
|
compiler: implement labeled switch/continue
|
|
|
|
Resolves: #9938
|
|
This is mainly useful in conjunction with Decl Literals (#9938).
Resolves: #19777
|
|
The compiler actually doesn't need any functional changes for this: Sema
does reification based on the tag indices of `std.builtin.Type` already!
So, no zig1.wasm update is necessary.
This change is necessary to disallow name clashes between fields and
decls on a type, which is a prerequisite of #9938.
|
|
Implements the accepted proposal to introduce `@branchHint`. This
builtin is permitted as the first statement of a block if that block is
the direct body of any of the following:
* a function (*not* a `test`)
* either branch of an `if`
* the RHS of a `catch` or `orelse`
* a `switch` prong
* an `or` or `and` expression
It lowers to the ZIR instruction `extended(branch_hint(...))`. When Sema
encounters this instruction, it sets `sema.branch_hint` appropriately,
and `zirCondBr` etc are expected to reset this value as necessary. The
state is on `Sema` rather than `Block` to make it automatically
propagate up non-conditional blocks without special handling. If
`@panic` is reached, the branch hint is set to `.cold` if none was
already set; similarly, error branches get a hint of `.unlikely` if no
hint is explicitly provided. If a condition is comptime-known, `cold`
hints from the taken branch are allowed to propagate up, but other hints
are discarded. This is because a `likely`/`unlikely` hint just indicates
the direction this branch is likely to go, which is redundant
information when the branch is known at comptime; but `cold` hints
indicate that control flow is unlikely to ever reach this branch,
meaning if the branch is always taken from its parent, then the parent
is also unlikely to ever be reached.
This branch information is stored in AIR `cond_br` and `switch_br`. In
addition, `try` and `try_ptr` instructions have variants `try_cold` and
`try_ptr_cold` which indicate that the error case is cold (rather than
just unlikely); this is reachable through e.g. `errdefer unreachable` or
`errdefer @panic("")`.
A new API `unwrapSwitch` is introduced to `Air` to make it more
convenient to access `switch_br` instructions. In time, I plan to update
all AIR instructions to be accessed via an `unwrap` method which returns
a convenient tagged union a la `InternPool.indexToKey`.
The LLVM backend lowers branch hints for conditional branches and
switches as follows:
* If any branch is marked `unpredictable`, the instruction is marked
`!unpredictable`.
* Any branch which is marked as `cold` gets a
`llvm.assume(i1 true) [ "cold"() ]` call to mark the code path cold.
* If any branch is marked `likely` or `unlikely`, branch weight metadata
is attached with `!prof`. Likely branches get a weight of 2000, and
unlikely branches a weight of 1. In `switch` statements, un-annotated
branches get a weight of 1000 as a "middle ground" hint, since there
could be likely *and* unlikely *and* un-annotated branches.
For functions, a `cold` hint corresponds to the `cold` function
attribute, and other hints are currently ignored -- as far as I can tell
LLVM doesn't really have a way to lower them. (Ideally, we would want
the branch hint given in the function to propagate to call sites.)
The compiler and standard library do not yet use this new builtin.
Resolves: #21148
|
|
Resolves: #14911
|
|
This replaces the constant `Zir.Inst.Ref` tags (and the analagous tags
in `Air.Inst.Ref`, `InternPool.Index`) referring to types in
`std.builtin` with a ZIR instruction `extended(builtin_type(...))` which
instructs Sema to fetch such a type, effectively as if it were a
shorthand for the ZIR for `@import("std").builtin.xyz`.
Previously, this was achieved through constant tags in `Ref`. The
analagous `InternPool` indices began as `simple_type` values, and were
later rewritten to the correct type information. This system was kind of
brittle, and more importantly, isn't compatible with incremental
compilation of std, since incremental compilation relies on the ability
to recreate types at different indices when they change. Replacing the
old system with this instruction slightly increases the size of ZIR, but
it simplifies logic and allows incremental compilation to work correctly
on the standard library.
This shouldn't have a significant impact on ZIR size or compiler
performance, but I will take measurements in the PR to confirm this.
|
|
This is in preparation for incremental and actually being able to debug
executables built by the x86_64 backend.
|
|
This is needed to ensure that start code does not try to access thread
local storage before it has set up thread local storage.
|
|
|
|
We need special logic for updating line numbers anyway, so it's fine to
just use absolute numbers here. This eliminates a field from `Decl`.
|
|
This patch is a pure rename plus only changing the file path in
`@import` sites, so it is expected to not create version control
conflicts, even when rebasing.
|
|
Since we track `reify` instructions across incremental updates, it is
acceptable to treat it as the baseline for a relative source location.
This turns out to be a good idea, since it makes it easy to define the
source location for a reified type.
|