| Age | Commit message (Collapse) | Author |
|
spirv: push constants and small fixes
|
|
|
|
- Rename GPU address spaces to match with SPIR-V spec.
- Emit `Block` Decoration for Uniform/PushConstant variables.
- Don't emit `OpTypeForwardPointer` for non-opencl targets.
(there's still a false-positive about recursive structs)
Signed-off-by: Ali Cheraghi <alichraghi@proton.me>
|
|
This commit reworks how anonymous struct literals and tuples work.
Previously, an untyped anonymous struct literal
(e.g. `const x = .{ .a = 123 }`) was given an "anonymous struct type",
which is a special kind of struct which coerces using structural
equivalence. This mechanism was a holdover from before we used
RLS / result types as the primary mechanism of type inference. This
commit changes the language so that the type assigned here is a "normal"
struct type. It uses a form of equivalence based on the AST node and the
type's structure, much like a reified (`@Type`) type.
Additionally, tuples have been simplified. The distinction between
"simple" and "complex" tuple types is eliminated. All tuples, even those
explicitly declared using `struct { ... }` syntax, use structural
equivalence, and do not undergo staged type resolution. Tuples are very
restricted: they cannot have non-`auto` layouts, cannot have aligned
fields, and cannot have default values with the exception of `comptime`
fields. Tuples currently do not have optimized layout, but this can be
changed in the future.
This change simplifies the language, and fixes some problematic
coercions through pointers which led to unintuitive behavior.
Resolves: #16865
|
|
|
|
We can use real pointers with this storage class!!
|
|
* Fragment and Vertex CCs are only valid for SPIR-V when
running under Vulkan.
* Emit GLCompute instead of Kernel for SPIR-V kernels.
|
|
This commit begins implementing accepted proposal #21209 by making
`std.builtin.CallingConvention` a tagged union.
The stage1 dance here is a little convoluted. This commit introduces the
new type as `NewCallingConvention`, keeping the old `CallingConvention`
around. The compiler uses `std.builtin.NewCallingConvention`
exclusively, but when fetching the type from `std` when running the
compiler (e.g. with `getBuiltinType`), the name `CallingConvention` is
used. This allows a prior build of Zig to be used to build this commit.
The next commit will update `zig1.wasm`, and then the compiler and
standard library can be updated to completely replace
`CallingConvention` with `NewCallingConvention`.
The second half of #21209 is to remove `@setAlignStack`, which will be
implemented in another commit after updating `zig1.wasm`.
|
|
It seems that these are now automatically added to AIR in Sema.
|
|
|
|
|
|
* Adds new cpu architectures propeller1 and propeller2.
These cpu architectures allow targeting the Parallax Propeller 1 and Propeller 2, which are both very special microcontrollers with 512 registers and 8 cpu cores.
Resolves #21559
* Adds std.elf.EM.PROPELLER and std.elf.EM.PROPELLER2
* Fixes missing switch prongs in src/codegen/llvm.zig
* Fixes order in std.Target.Arch
---------
Co-authored-by: Felix "xq" Queißner <git@random-projects.net>
|
|
|
|
This commit introduces a new AIR instruction, `repeat`, which causes
control flow to move back to the start of a given AIR loop. `loop`
instructions will no longer automatically perform this operation after
control flow reaches the end of the body.
The motivation for making this change now was really just consistency
with the upcoming implementation of #8220: it wouldn't make sense to
have this feature work significantly differently. However, there were
already some TODOs kicking around which wanted this feature. It's useful
for two key reasons:
* It allows loops over AIR instruction bodies to loop precisely until
they reach a `noreturn` instruction. This allows for tail calling a
few things, and avoiding a range check on each iteration of a hot
path, plus gives a nice assertion that validates AIR structure a
little. This is a very minor benefit, which this commit does apply to
the LLVM and C backends.
* It should allow for more compact ZIR and AIR to be emitted by having
AstGen emit `repeat` instructions more often rather than having
`continue` statements `break` to a `block` which is *followed* by a
`repeat`. This is done in status quo because `repeat` instructions
only ever cause the direct parent block to repeat. Now that AIR is
more flexible, this flexibility can be pretty trivially extended to
ZIR, and we can then emit better ZIR. This commit does not implement
this.
Support for this feature is currently regressed on all self-hosted
native backends, including x86_64. This support will be added where
necessary before this branch is merged.
|
|
This commit modifies the representation of the AIR `switch_br`
instruction to represent ranges in cases. Previously, Sema emitted
different AIR in the case of a range, where the `else` branch of the
`switch_br` contained a simple `cond_br` for each such case which did a
simple range check (`x > a and x < b`). Not only does this add
complexity to Sema, which we would like to minimize, but it also gets in
the way of the implementation of #8220. That proposal turns certain
`switch` statements into a looping construct, and for optimization
purposes, we want to lower this to AIR fairly directly (i.e. without
involving a `loop` instruction). That means we would ideally like a
single instruction to represent the entire `switch` statement, so that
we can dispatch back to it with a different operand as in #8220. This is
not really possible to do correctly under the status quo system.
This commit implements lowering of this new `switch_br` usage in the
LLVM and C backends. The C backend just turns any case containing ranges
entirely into conditionals, as before. The LLVM backend is a little
smarter, and puts scalar items into the `switch` instruction, only using
conditionals for the range cases (which direct to the same bb). All
remaining self-hosted backends are temporarily regressed in the presence
of switch range cases. This functionality will be restored for at least
the x86_64 backend before merge.
|
|
The compiler actually doesn't need any functional changes for this: Sema
does reification based on the tag indices of `std.builtin.Type` already!
So, no zig1.wasm update is necessary.
This change is necessary to disallow name clashes between fields and
decls on a type, which is a prerequisite of #9938.
|
|
|
|
Implements the accepted proposal to introduce `@branchHint`. This
builtin is permitted as the first statement of a block if that block is
the direct body of any of the following:
* a function (*not* a `test`)
* either branch of an `if`
* the RHS of a `catch` or `orelse`
* a `switch` prong
* an `or` or `and` expression
It lowers to the ZIR instruction `extended(branch_hint(...))`. When Sema
encounters this instruction, it sets `sema.branch_hint` appropriately,
and `zirCondBr` etc are expected to reset this value as necessary. The
state is on `Sema` rather than `Block` to make it automatically
propagate up non-conditional blocks without special handling. If
`@panic` is reached, the branch hint is set to `.cold` if none was
already set; similarly, error branches get a hint of `.unlikely` if no
hint is explicitly provided. If a condition is comptime-known, `cold`
hints from the taken branch are allowed to propagate up, but other hints
are discarded. This is because a `likely`/`unlikely` hint just indicates
the direction this branch is likely to go, which is redundant
information when the branch is known at comptime; but `cold` hints
indicate that control flow is unlikely to ever reach this branch,
meaning if the branch is always taken from its parent, then the parent
is also unlikely to ever be reached.
This branch information is stored in AIR `cond_br` and `switch_br`. In
addition, `try` and `try_ptr` instructions have variants `try_cold` and
`try_ptr_cold` which indicate that the error case is cold (rather than
just unlikely); this is reachable through e.g. `errdefer unreachable` or
`errdefer @panic("")`.
A new API `unwrapSwitch` is introduced to `Air` to make it more
convenient to access `switch_br` instructions. In time, I plan to update
all AIR instructions to be accessed via an `unwrap` method which returns
a convenient tagged union a la `InternPool.indexToKey`.
The LLVM backend lowers branch hints for conditional branches and
switches as follows:
* If any branch is marked `unpredictable`, the instruction is marked
`!unpredictable`.
* Any branch which is marked as `cold` gets a
`llvm.assume(i1 true) [ "cold"() ]` call to mark the code path cold.
* If any branch is marked `likely` or `unlikely`, branch weight metadata
is attached with `!prof`. Likely branches get a weight of 2000, and
unlikely branches a weight of 1. In `switch` statements, un-annotated
branches get a weight of 1000 as a "middle ground" hint, since there
could be likely *and* unlikely *and* un-annotated branches.
For functions, a `cold` hint corresponds to the `cold` function
attribute, and other hints are currently ignored -- as far as I can tell
LLVM doesn't really have a way to lower them. (Ideally, we would want
the branch hint given in the function to propagate to call sites.)
The compiler and standard library do not yet use this new builtin.
Resolves: #21148
|
|
My main gripes with this design were that it was incorrectly namespaced, the naming was inconsistent and a bit wrong (`fooAlign` vs `fooAlignment`).
This commit moves all the logic from `PerThread.zig` to use the zcu + tid system that the previous couple commits introduce.
I've organized and merged the functions to be a bit more specific to their own purpose.
- `fieldAlignment` takes a struct or union type, an index, and a Zcu (or the Sema version which takes a Pt), and gives you the alignment of the field at the index.
- `structFieldAlignment` takes the field type itself, and provides the logic to handle special cases, such as externs.
A design goal I had in mind was to avoid using the word 'struct' in the function name, when it worked for things that aren't structs, such as unions.
|
|
|
|
|
|
The type `Zcu.Decl` in the compiler is problematic: over time it has
gained many responsibilities. Every source declaration, container type,
generic instantiation, and `@extern` has a `Decl`. The functions of
these `Decl`s are in some cases entirely disjoint.
After careful analysis, I determined that the two main responsibilities
of `Decl` are as follows:
* A `Decl` acts as the "subject" of semantic analysis at comptime. A
single unit of analysis is either a runtime function body, or a
`Decl`. It registers incremental dependencies, tracks analysis errors,
etc.
* A `Decl` acts as a "global variable": a pointer to it is consistent,
and it may be lowered to a specific symbol by the codegen backend.
This commit eliminates `Decl` and introduces new types to model these
responsibilities: `Cau` (Comptime Analysis Unit) and `Nav` (Named
Addressable Value).
Every source declaration, and every container type requiring resolution
(so *not* including `opaque`), has a `Cau`. For a source declaration,
this `Cau` performs the resolution of its value. (When #131 is
implemented, it is unsolved whether type and value resolution will share
a `Cau` or have two distinct `Cau`s.) For a type, this `Cau` is the
context in which type resolution occurs.
Every non-`comptime` source declaration, every generic instantiation,
and every distinct `extern` has a `Nav`. These are sent to codegen/link:
the backends by definition do not care about `Cau`s.
This commit has some minor technically-breaking changes surrounding
`usingnamespace`. I don't think they'll impact anyone, since the changes
are fixes around semantics which were previously inconsistent (the
behavior changed depending on hashmap iteration order!).
Aside from that, this changeset has no significant user-facing changes.
Instead, it is an internal refactor which makes it easier to correctly
model the responsibilities of different objects, particularly regarding
incremental compilation. The performance impact should be negligible,
but I will take measurements before merging this work into `master`.
Co-authored-by: Jacob Young <jacobly0@users.noreply.github.com>
Co-authored-by: Jakub Konka <kubkon@jakubkonka.com>
|
|
|
|
|
|
This allows the mutate mutex to only be locked during actual grows,
which are rare. For the lists that didn't previously have a mutex, this
change has little effect since grows are rare and there is zero
contention on a mutex that is only ever locked by one thread. This
change allows `extra` to be mutated without racing with a grow.
|
|
|
|
This avoids needing to mutate the intern pool from backends.
|
|
|
|
|
|
Primarily, this commit removes 2 fields from File, relying on the data
being stored in the `files` field, with the key as the path digest, and
the value as the struct decl corresponding to the File. This table is
serialized into the compiler state that survives between incremental
updates.
Meanwhile, the File struct remains ephemeral data that can be
reconstructed the first time it is needed by the compiler process, as
well as operated on by independent worker threads.
A key outcome of this commit is that there is now a stable index that
can be used to refer to a File. This will be needed when serializing
error messages to survive incremental compilation updates.
|
|
|
|
This change modifies `Zcu.ErrorMsg` to store a `Zcu.LazySrcLoc` rather
than a `Zcu.SrcLoc`. Everything else is dominoes.
The reason for this change is incremental compilation. If a failed
`AnalUnit` is up-to-date on an update, we want to re-use the old error
messages. However, the file containing the error location may have been
modified, and `SrcLoc` cannot survive such a modification. `LazySrcLoc`
is designed to be correct across incremental updates. Therefore, we
defer source location resolution until `Compilation` gathers the compile
errors into the `ErrorBundle`.
|
|
This change seeks to more appropriately model the way semantic analysis
works by drawing a more clear line between errors emitted by analyzing a
`Decl` (in future a `Cau`) and errors emitted by analyzing a runtime
function.
This does change a few compile errors surrounding compile logs by adding
more "also here" notes. The new notes are more technically correct, but
perhaps not so helpful. They're not doing enough harm for me to put
extensive thought into this for now.
|
|
We need special logic for updating line numbers anyway, so it's fine to
just use absolute numbers here. This eliminates a field from `Decl`.
|
|
This patch is a pure rename plus only changing the file path in
`@import` sites, so it is expected to not create version control
conflicts, even when rebasing.
|
|
The Great Decl Split (preliminary work): refactor source locations and eliminate `Sema.Block.src_decl`.
|
|
|
|
`LazySrcLoc` now stores a reference to the "base AST node" to which it
is relative. The previous tagged union is `LazySrcLoc.Offset`. To make
working with this structure convenient, `Sema.Block` contains a
convenience `src` method which takes an `Offset` and returns a
`LazySrcLoc`.
The "base node" of a source location is no longer given by a `Decl`, but
rather a `TrackedInst` representing either a `declaration`,
`struct_decl`, `union_decl`, `enum_decl`, or `opaque_decl`. This is a
more appropriate model, and removes an unnecessary responsibility from
`Decl` in preparation for the upcoming refactor which will split it into
`Nav` and `Cau`.
As a part of these `Decl` reworks, the `src_node` field is eliminated.
This change aids incremental compilation, and simplifies `Decl`. In some
cases -- particularly in backends -- the source location of a
declaration is desired. This was previously `Decl.srcLoc` and worked for
any `Decl`. Now, it is `Decl.navSrcLoc` in reference to the upcoming
refactor, since the set of `Decl`s this works for precisely corresponds
to what will in future become a `Nav` -- that is, source-level
declarations and generic function instantiations, but *not* type owner
Decls.
This commit introduces more tags to `LazySrcLoc.Offset` so as to
eliminate the concept of `error.NeededSourceLocation`. Now, `.unneeded`
should only be used to assert that an error path is unreachable. In the
future, uses of `.unneeded` can probably be replaced with `undefined`.
The `src_decl` field of `Sema.Block` no longer has a role in type
resolution. Its main remaining purpose is to handle namespacing of type
names. It will be eliminated entirely in a future commit to remove
another undue responsibility from `Decl`.
It is worth noting that in future, the `Zcu.SrcLoc` type should probably
be eliminated entirely in favour of storing `Zcu.LazySrcLoc` values.
This is because `Zcu.SrcLoc` is not valid across incremental updates,
and we want to be able to reuse error messages from previous updates
even if the source file in question changed. The error reporting logic
should instead simply resolve the location from the `LazySrcLoc` on the
fly.
|
|
This is in preparation for some upcoming changes to how we represent
source locations in the compiler. The bulk of the change here is dealing
with the removal of `src()` methods from `Zir` types.
|
|
|
|
The old vectorization helper (WipElementWise) was clunky and a bit
annoying to use, and it wasn't really flexible enough.
This introduces a new vectorization helper, which uses Temporary and
Operation types to deduce a Vectorization to perform the operation
in a reasonably efficient manner. It removes the outer loop
required by WipElementWise so that implementations of AIR instructions
are cleaner. This helps with sanity when we start to introduce support
for composite integers.
airShift, convertToDirect, convertToIndirect, and normalize are initially
implemented using this new method.
|
|
Now that we use POCL to test, we no longer need this ✨
|
|
Previously the child type of a vector was always in indirect representation.
Concretely, this meant that vectors of bools are represented by vectors
of u8.
This was undesirable because it introduced a difference between vectorizable
operations with a scalar bool and a vector of bool. This commit changes the
representation to be the same for vectors and scalars everywhere.
Some issues arised with constructing vectors: it seems the previous temporary-
and-pointer approach does not work properly with vectors of bool. To work around
this, simply use OpCompositeConstruct. This is the proper instruction for this,
but it was previously not used because of a now-solved limitation in the
SPIRV-LLVM-Translator. It was not yet applied to Zig because the Intel OpenCL
CPU runtime does not have a recent enough version of the translator yet, but
to solve that we just switch to testing with POCL instead.
|
|
We've got a big one here! This commit reworks how we represent pointers
in the InternPool, and rewrites the logic for loading and storing from
them at comptime.
Firstly, the pointer representation. Previously, pointers were
represented in a highly structured manner: pointers to fields, array
elements, etc, were explicitly represented. This works well for simple
cases, but is quite difficult to handle in the cases of unusual
reinterpretations, pointer casts, offsets, etc. Therefore, pointers are
now represented in a more "flat" manner. For types without well-defined
layouts -- such as comptime-only types, automatic-layout aggregates, and
so on -- we still use this "hierarchical" structure. However, for types
with well-defined layouts, we use a byte offset associated with the
pointer. This allows the comptime pointer access logic to deal with
reinterpreted pointers far more gracefully, because the "base address"
of a pointer -- for instance a `field` -- is a single value which
pointer accesses cannot exceed since the parent has undefined layout.
This strategy is also more useful to most backends -- see the updated
logic in `codegen.zig` and `codegen/llvm.zig`. For backends which do
prefer a chain of field and elements accesses for lowering pointer
values, such as SPIR-V, there is a helpful function in `Value` which
creates a strategy to derive a pointer value using ideally only field
and element accesses. This is actually more correct than the previous
logic, since it correctly handles pointer casts which, after the dust
has settled, end up referring exactly to an aggregate field or array
element.
In terms of the pointer access code, it has been rewritten from the
ground up. The old logic had become rather a mess of special cases being
added whenever bugs were hit, and was still riddled with bugs. The new
logic was written to handle the "difficult" cases correctly, the most
notable of which is restructuring of a comptime-only array (for
instance, converting a `[3][2]comptime_int` to a `[2][3]comptime_int`.
Currently, the logic for loading and storing work somewhat differently,
but a future change will likely improve the loading logic to bring it
more in line with the store strategy. As far as I can tell, the rewrite
has fixed all bugs exposed by #19414.
As a part of this, the comptime bitcast logic has also been rewritten.
Previously, bitcasts simply worked by serializing the entire value into
an in-memory buffer, then deserializing it. This strategy has two key
weaknesses: pointers, and undefined values. Representations of these
values at comptime cannot be easily serialized/deserialized whilst
preserving data, which means many bitcasts would become runtime-known if
pointers were involved, or would turn `undefined` values into `0xAA`.
The new logic works by "flattening" the datastructure to be cast into a
sequence of bit-packed atomic values, and then "unflattening" it; using
serialization when necessary, but with special handling for `undefined`
values and for pointers which align in virtual memory. The resulting
code is definitely slower -- more on this later -- but it is correct.
The pointer access and bitcast logic required some helper functions and
types which are not generally useful elsewhere, so I opted to split them
into separate files `Sema/comptime_ptr_access.zig` and
`Sema/bitcast.zig`, with simple re-exports in `Sema.zig` for their small
public APIs.
Whilst working on this branch, I caught various unrelated bugs with
transitive Sema errors, and with the handling of `undefined` values.
These bugs have been fixed, and corresponding behavior test added.
In terms of performance, I do anticipate that this commit will regress
performance somewhat, because the new pointer access and bitcast logic
is necessarily more complex. I have not yet taken performance
measurements, but will do shortly, and post the results in this PR. If
the performance regression is severe, I will do work to to optimize the
new logic before merge.
Resolves: #19452
Resolves: #19460
|
|
This deletes a ton of lookups and avoids many UAF bugs.
Closes #19485
|
|
|
|
|
|
This allows us to more sanely allocate a continuous
range of result-ids, and avoids a bunch of nasty
casting code in a few places. Its currently not used
very often, but will be useful in the future.
|
|
|
|
|