| Age | Commit message | Author |
|
|
|
types
|
|
comptime memory
I have no idea how correct this code is, but I'm working on a full
rewrite of this logic anyway, and this certainly seems more correct than
before.
|
|
This was previously implemented by analyzing the AIR emitted prior to
the ZIR `make_ptr_const` instruction. That solution was highly delicate,
and in particular broke down whenever a second `alloc` appeared between
the original `alloc` and its `store` instructions, which is especially
common in destructure statements.
Sema now uses a different strategy to detect whether a `const` is
comptime-known. When the `alloc` is created, Sema begins tracking all
pointers and stores which refer to that allocation in temporary local
state. If any store is not comptime-known or has a higher runtime index
than the allocation, the allocation is marked as being runtime-known.
When we reach the `make_ptr_const` instruction, if the allocation is not
marked as runtime-known, it must be comptime-known. Sema will use the
set of `store` instructions to re-initialize the value in comptime
memory. We optimize for the common case of a single `store` instruction
by skipping the comptime alloc entirely and plucking the result value
directly from that store.
Resolves: #16083
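A minimal sketch of the kind of code this affects (the destructure creates one `alloc` per `const`, and the new tracking follows the stores into each):
```zig
const std = @import("std");

test "consts initialized through a destructure are comptime-known" {
    // Each `const` gets its own alloc; all stores into them are
    // comptime-known, so both values end up comptime-known.
    const a, const b = .{ 1, 2 };
    // Usable where a comptime-known value is required:
    const buf: [a + b]u8 = undefined;
    try std.testing.expect(buf.len == 3);
}
```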
|
|
Previously, @memset at comptime performed N pointer stores. This is
less efficient than storing a whole array of values at once. The
difference can be quite drastic when reinterpreting memory: a test case
that takes 40s on the master branch now takes under a second, even on a
debug build of the compiler.
Resolves: #17214
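For illustration, a comptime @memset of the kind that benefits (the size here is arbitrary):
```zig
const std = @import("std");

test "@memset at comptime" {
    comptime {
        var buf: [4096]u8 = undefined;
        // Now stored as one whole array value rather than 4096 pointer stores.
        @memset(&buf, 0xaa);
        std.debug.assert(buf[0] == 0xaa and buf[4095] == 0xaa);
    }
}
```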
|
|
This required a third `if (found_already == null)` in another place in
AstGen.zig for this special case of `@export`.
Fixes #17188
|
|
When struct types have no field names, the names are implicitly
understood to be strings corresponding to the field indexes in
declaration order. It used to be the case that a NullTerminatedString
was stored for each field anyway; now, callers must handle the
possibility that no names are stored at all. This commit introduces
`legacyStructFieldName`, a function that fakes the previous behavior.
Something better could probably be done by reworking all the call sites
of this function.
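The user-visible behavior is unchanged; tuple field names are still observable as decimal index strings:
```zig
const std = @import("std");

test "tuple field names are implicitly the field indexes" {
    const Tuple = struct { u32, bool }; // no field names stored
    const fields = std.meta.fields(Tuple);
    try std.testing.expectEqualStrings("0", fields[0].name);
    try std.testing.expectEqualStrings("1", fields[1].name);
}
```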
|
|
|
|
|
|
This is an alternate version of c675daac7d0d6328ad89297e8938a6b17b59c489
|
|
This changeset fixes the handling of alignment in several places. The
new rules are:
* `@alignOf(T)` where `T` is a runtime zero-bit type is at least 1,
maybe greater.
* Zero-bit fields in `extern` structs *do* force alignment, potentially
offsetting following fields.
* Zero-bit fields *do* have addresses within structs which can be
observed and are consistent with `@offsetOf`.
These are not necessarily all implemented correctly yet (see disabled
test), but this commit fixes all regressions compared to master, and
makes one new test pass.
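A small sketch of the first rule (the `extern` struct rules are the ones exercised by the disabled test mentioned above):
```zig
const std = @import("std");

const ZeroBit = struct {}; // a runtime zero-bit type

test "zero-bit types have an alignment of at least 1" {
    try std.testing.expect(@sizeOf(ZeroBit) == 0);
    try std.testing.expect(@alignOf(ZeroBit) >= 1);
}
```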
|
|
|
|
|
|
Previously it would canonicalize or not depending on some volatile
internal state of the compiler; now it forces resolution of the element
type to determine the alignment when it needs to.
|
|
|
|
|
|
We're hitting false compile errors, but this is progress!
|
|
Let's try to reduce the explosive scope of this branch.
|
|
|
|
This also modifies AstGen so that struct types use 1 bit each from the
flags to communicate if there are nonzero inits, alignments, or comptime
fields. This allows adding a struct type to the InternPool without
looking ahead in memory to find out the answers to these questions,
which is easier for CPUs as well as for me, coding this logic right now.
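A hypothetical sketch of the idea (these names are illustrative, not the actual ZIR encoding):
```zig
// Hypothetical illustration, not the real ZIR flags: one bit per question
// lets Sema answer it without looking ahead in memory.
const StructTypeFlags = packed struct(u16) {
    any_default_inits: bool,
    any_aligned_fields: bool,
    any_comptime_fields: bool,
    _unused: u13 = 0,
};
```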
|
|
Structs were previously using `SegmentedList` to be given indexes, but
were not actually backed by the InternPool arrays.
After this, the only remaining uses of `SegmentedList` in the compiler
are `Module.Decl` and `Module.Namespace`. Once those last two are
migrated to become backed by InternPool arrays as well, we can introduce
state serialization via writing these arrays to disk all at once.
Unfortunately there are a lot of source code locations that touch the
struct type API, so this commit is still work-in-progress. Once I get it
compiling and passing the test suite, I can provide some interesting
data points such as how it affected the InternPool memory size and
performance comparison against master branch.
I also couldn't resist migrating a bunch of the alignment API over to
use the log2 Alignment type rather than a mishmash of u32 and u64 byte
units, with 0 implicitly meaning something different and special at
every location. It turns out you can do all the math you need directly
on the log2 representation of alignments.
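For example, here is a minimal standalone sketch of alignment math done directly in log2 form (an illustration, not the compiler's actual `Alignment` type):
```zig
const std = @import("std");

/// Illustrative only: an alignment stored as a log2 exponent.
const Alignment = enum(u6) {
    @"1" = 0,
    @"2" = 1,
    @"4" = 2,
    @"8" = 3,
    _,

    fn fromByteUnits(n: u64) Alignment {
        std.debug.assert(std.math.isPowerOfTwo(n));
        return @enumFromInt(@as(u6, @intCast(@ctz(n))));
    }

    fn toByteUnits(a: Alignment) u64 {
        return @as(u64, 1) << @intFromEnum(a);
    }

    /// The stricter of two alignments is simply the larger exponent.
    fn max(a: Alignment, b: Alignment) Alignment {
        return @enumFromInt(@max(@intFromEnum(a), @intFromEnum(b)));
    }

    /// Round `addr` up to this alignment without leaving log2 form.
    fn forward(a: Alignment, addr: u64) u64 {
        const mask = (@as(u64, 1) << @intFromEnum(a)) - 1;
        return (addr + mask) & ~mask;
    }
};

test "alignment math on the log2 representation" {
    const a = Alignment.fromByteUnits(8);
    try std.testing.expect(a.toByteUnits() == 8);
    try std.testing.expect(Alignment.max(a, .@"2") == a);
    try std.testing.expect(a.forward(13) == 16);
}
```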
|
|
Previously, the compiler (e.g. in @typeName) wrote function types as
`fn(...) Type`, but zig fmt writes `fn (...) Type` (note the space after
`fn`). This inconsistency is now resolved: function types are
consistently written the zig fmt way. Even before this change,
`fn (...) Type` occurrences already outnumbered `fn(...) Type`.
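For example (assuming the default calling convention is not printed):
```zig
const std = @import("std");

test "@typeName writes function types the zig fmt way" {
    // Note the space after `fn`, matching zig fmt.
    try std.testing.expectEqualStrings("fn (u32) void", @typeName(fn (u32) void));
}
```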
|
|
This is supposed to be the case, similar to how pointers to generic
functions are comptime-only (several pieces of logic already assumed
this). Treating these types as runtime types caused `dbg_var_val` AIR
instructions to be wrongly emitted for such values, making codegen
backends create a runtime reference to the inline function, which (at
least on the LLVM backend) triggers an error.
Resolves: #38
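A small sketch of the kind of value affected; the point is that `f` below has a comptime-only type, so no runtime debug-variable instruction is emitted for it:
```zig
const std = @import("std");

inline fn addOne(x: u32) u32 {
    return x + 1;
}

test "a binding to an inline function is comptime-only" {
    // `f` refers to an inline function, so its type is comptime-only and no
    // dbg_var_val (runtime reference) is emitted for it.
    const f = addOne;
    try std.testing.expect(f(41) == 42);
}
```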
|
|
|
|
This change implements the following syntax in the compiler:
```zig
const x: u32, var y, foo.bar = .{ 1, 2, 3 };
```
A destructure expression may only appear within a block (i.e. not at
container scope). The LHS consists of a sequence of comma-separated var
decls and/or lvalue expressions. The RHS is a normal expression.
A new result location type, `destructure`, is used, which contains
result pointers for each component of the destructure. This means that
when the RHS is a more complicated expression, peer type resolution is
not used: each result value is individually destructured and written to
the result pointers. RLS is always used for destructure expressions,
meaning every `const` on the LHS of such an expression creates a true
stack allocation.
Aside from anonymous array literals, Sema is capable of destructuring
the following types:
* Tuples
* Arrays
* Vectors
A destructure may be prefixed with the `comptime` keyword, in which case
the entire destructure is evaluated at comptime: this means all `var`s
in the LHS are `comptime var`s, every lvalue expression is evaluated at
comptime, and the RHS is evaluated at comptime. If every LHS is a
`const`, this is not allowed: as with single declarations, the user
should instead mark the RHS as `comptime`.
There are a few subtleties in the grammar changes here. For one thing,
if every LHS is an lvalue expression (rather than a var decl), a
destructure is considered an expression. This makes, for instance,
`if (cond) x, y = .{ 1, 2 };` valid Zig code. A destructure is allowed
in almost every context where a standard assignment expression is
permitted. The exception is `switch` prongs, which cannot be
destructures, since the comma would be ambiguous with the end of the
prong.
A follow-up commit will begin utilizing this syntax in the Zig compiler.
Resolves: #498
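A couple of the cases described above, for illustration:
```zig
const std = @import("std");

test "destructuring an array into existing lvalues" {
    var x: u32 = undefined;
    var y: u32 = undefined;
    const arr: [2]u32 = .{ 10, 20 };
    // Every LHS is an lvalue, so the destructure is itself an expression.
    x, y = arr;
    try std.testing.expect(x == 10 and y == 20);
}

test "destructuring a vector" {
    const v: @Vector(2, i32) = .{ 3, 4 };
    const a, const b = v;
    try std.testing.expect(a == 3 and b == 4);
}
```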
|
|
When RLS is used to initialize a value with a comptime-only type, the
usual "value with comptime-only type depends on runtime control flow"
error message isn't hit, because we don't use results from a block. When
we reach `make_ptr_const`, we must validate that the value is
comptime-known.
|
|
* Use 32-bit integers instead of pointers for compactness and
serialization friendliness.
* Use a separate hash map for runtime and comptime capture scopes,
avoiding the 1-bit union tag.
* Use a compact array representation instead of a tree of hash maps.
* Eliminate the only instance of ref-counting in the compiler, instead
relying on garbage collection (not implemented yet, but that is the plan
for almost all long-lived objects related to incremental compilation).
Because a code modification may need to access capture scope data,
capture scope data becomes long-lived state.
compilation state serialization down to a single pwritev syscall, by
unifying the on-disk representation with the in-memory representation.
This commit eliminates the last remaining pointer field of
`Module.Decl`.
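A rough sketch of the shape this takes (the names here are hypothetical, purely to illustrate the flat, index-based layout):
```zig
const std = @import("std");

// Hypothetical illustration, not the compiler's actual definitions: capture
// scopes as plain 32-bit indices into flat arrays, so the whole structure can
// be written to disk in one go.
const CaptureScope = struct {
    parent: Index,
    /// Captures for this scope are stored contiguously in one shared array.
    captures_start: u32,
    captures_len: u32,

    const Index = enum(u32) { none = std.math.maxInt(u32), _ };
};
```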
|
|
Instead of using actual slices for InternPool.Key.AnonStructType, this
commit uses Slice types, which store a long-lived index rather than a
pointer.
This is a follow-up to 7ef1eb1c27754cb0349fdc10db1f02ff2dddd99b.
|
|
* remove unreachable code
* remove already handled cases
* avoid `InternPool.getCoerced`
* add some undef checks
* error when converting undef int to float
Closes #16987
|
|
|
|
This allows reference traces to begin at the outermost inline call.
|
|
We will need this in a bit...
|
|
This makes the call sites easier to read, reduces the number of `catch`
expressions required, and prepares for comptime reasons to appear
earlier in the list of notes.
|
|
|
|
|
|
Resolves: #16986
|
|
The following cast builtins did not previously work on vectors, and have
been made to:
* `@floatCast`
* `@ptrFromInt`
* `@intFromPtr`
* `@floatFromInt`
* `@intFromFloat`
* `@intFromBool`
Resolves: #16267
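For example, these now compile, with element-wise semantics matching the scalar builtins:
```zig
const std = @import("std");

test "cast builtins on vectors" {
    const f: @Vector(4, f64) = .{ 1.5, 2.5, 3.5, 4.5 };
    const narrowed: @Vector(4, f32) = @floatCast(f);
    const truncated: @Vector(4, i32) = @intFromFloat(narrowed);
    const bools: @Vector(4, bool) = .{ true, false, true, false };
    const bits: @Vector(4, u1) = @intFromBool(bools);

    try std.testing.expect(truncated[0] == 1 and truncated[3] == 4);
    try std.testing.expect(bits[1] == 0 and bits[2] == 1);
}
```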
|
|
Closes #15124
|
|
There are a couple concepts here worth understanding:
Key.UnionType - This type is available *before* resolving the union's
fields. The enum tag type, number of fields, field names, field types,
and field alignments are not available with this.
InternPool.UnionType - This one can be obtained from the above type with
`InternPool.loadUnionType`, which asserts that the union's enum tag type
has been resolved. This one has all the information available.
Additionally:
* ZIR: Turn an unused bit into `any_aligned_fields` flag to help
semantic analysis know whether a union has explicit alignment on any
fields (usually not).
* Sema: delete `resolveTypeRequiresComptime`, which had the same type
signature as, and near-duplicate logic to, `typeRequiresComptime`.
- Make opaque types not report comptime-only (this was inconsistent
between the two implementations of this function).
* Implement accepted proposal #12556 which is a breaking change.
|
|
Resolves: #16719
|
|
Resolves: #16698
|
|
The main motivation for this change is eliminating the `block_ptr`
result location and corresponding `store_to_block_ptr` ZIR instruction.
This is achieved through a simple pass over the AST before AstGen which
determines, for AST nodes which have a choice on whether to provide a
result location, which choice to make, based on whether the result
pointer is consumed non-trivially.
This eliminates so much logic from AstGen that we almost break even on
line count! AstGen no longer has to worry about instruction rewriting
based on whether or not a result location was consumed: it always knows
what to do ahead of time, which simplifies a *lot* of logic. This also
incidentally fixes a few random AstGen bugs related to result location
handling, leading to the changes in `test/` and `lib/std/`.
This opens the door to future RLS improvements by making them much
easier to implement correctly, and fixes many bugs. Most ZIR is made
more compact after this commit, mainly due to not having redundant
`store_to_block_ptr` instructions lying around, but also due to a few
bugs in the old system which are implicitly fixed here.
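To illustrate the distinction the pre-pass draws (a simplified reading, not a precise specification of the pass):
```zig
// The integer literal does not need a result pointer, so no result location
// is provided to the RHS here:
const a: u32 = 123;

// The aggregate initializer writes its fields through the result pointer, so
// a result location is provided and a temporary copy is avoided:
const b: [2]u32 = .{ 1, 2 };
```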
|
|
* Generalise NaN handling and make std.math.nan() give quiet NaNs
* Address uses of std.math.qnan_* and std.math.nan_* consts
* Comment out failing test due to issues with signalling NaN
* Fix issue in c_builtins.zig where we need qnan_u32
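A quick check of the new behavior (0x0040_0000 is the quiet bit of an f32):
```zig
const std = @import("std");

test "std.math.nan(f32) is a quiet NaN" {
    const n = std.math.nan(f32);
    try std.testing.expect(std.math.isNan(n));
    // The most significant mantissa bit (the quiet bit) is set.
    try std.testing.expect(@as(u32, @bitCast(n)) & 0x0040_0000 != 0);
}
```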
|
|
The key changes in this commit are:
```diff
- names: []const NullTerminatedString,
+ names: NullTerminatedString.Slice,
- values: []const Index,
+ values: Index.Slice,
```
This eliminates the slices from `InternPool.Key.EnumType` and replaces
them with structs that contain `start` and `len` indexes. This makes the
lifetime of `EnumType` change from expiring with updates to InternPool,
to expiring when the InternPool is garbage-collected, which is currently
never.
This is gearing up for a larger change I started working on locally
which moves union types into InternPool.
As a bonus, I fixed some unnecessary instances of `@as`.
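The general pattern, as a standalone sketch (the names are illustrative, not the InternPool code itself): an index-based slice stays valid even when the backing array is reallocated.
```zig
const std = @import("std");

/// Illustrative sketch of the `start` + `len` pattern described above.
const Slice = struct {
    start: u32,
    len: u32,

    fn get(s: Slice, backing: []const u32) []const u32 {
        return backing[s.start..][0..s.len];
    }
};

test "index-based slices survive reallocation of the backing array" {
    var backing = std.ArrayList(u32).init(std.testing.allocator);
    defer backing.deinit();

    try backing.appendSlice(&.{ 1, 2, 3 });
    const s: Slice = .{ .start = 0, .len = 3 };

    // Growing the list may move its memory; the index-based slice still works,
    // while a pointer-based []const u32 taken earlier could now dangle.
    try backing.appendNTimes(0, 1000);
    try std.testing.expectEqualSlices(u32, &.{ 1, 2, 3 }, s.get(backing.items));
}
```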
|
|
Some builtin types have a special InternPool index (e.g.
`.type_info_type`) so that AstGen can refer to them before semantic
analysis. Unfortunately, this previously led to a second index existing
to refer to the type once it was resolved, complicating Sema by having
the concept of an "unresolved" type index.
This change makes Sema modify these InternPool indices in-place to
contain the expanded representation when resolved. The analysis of the
corresponding decls is caught in `Module.semaDecl`, and a field is set
on Sema telling it which index to place struct/union/enum types at. This
system could break if `std.builtin` contained complex decls which
evaluate multiple struct types, but this will be caught by the
assertions in `InternPool.resolveBuiltinType`.
The AstGen result types which were disabled in 6917a8c have been
re-enabled.
Resolves: #16603
|
|
This type not being resolved was a bug which was being triggered by the
next commit.
|
|
The panic handler decl_val was previously given an `unneeded` source
location, which was then added to the reference trace, resulting in a
crash if the source location was used in the reference trace. This
commit makes two trivial changes:
* Don't add unneeded source locations to the ref table (panic in debug, silently ignore in release)
* Pass a real source location when analyzing the panic handler
|
|
After ff37ccd, interned values are trivial to convert to Air refs, using
`Air.internedToRef`. This made functions like `Sema.addConstant` effectively
redundant. This commit removes `Sema.addConstant` and `Sema.addType`, replacing
them with direct usages of `Air.internedToRef`.
Additionally, a new helper `Module.undefValue` is added, and the following
functions are moved into Module:
* `Sema.addConstUndef` -> `Module.undefRef`
* `Sema.addUnsignedInt` -> `Module.intRef` (now also works for signed types)
The general pattern here is that any `Module.xyzValue` helper may also have a
corresponding `Module.xyzRef` helper, which just wraps the call in
`Air.internedToRef`.
|
|
Closes #16744
|
|
As explained at https://llvm.org/docs/LangRef.html#vector-type:
"In general vector elements are laid out in memory in the same way as array types.
Such an analogy works fine as long as the vector elements are byte sized.
However, when the elements of the vector aren’t byte sized it gets a bit more complicated.
One way to describe the layout is by describing what happens when a vector such
as <N x iM> is bitcasted to an integer type with N*M bits, and then following the
rules for storing such an integer to memory."
"When <N*M> isn’t evenly divisible by the byte size the exact memory layout
is unspecified (just like it is for an integral type of the same size)."
|