path: root/src/AstGen.zig
Age | Commit message | Author
2024-02-26 | move AstGen to std.zig.AstGen | Andrew Kelley
Part of an effort to ship more of the compiler in source form.
2024-02-26 | move Zir to std.zig.Zir | Andrew Kelley
Part of an effort to ship more of the compiler in source form.
2024-02-26 | compiler: decide dbg_var scoping based on AIR blocks | mlugg
This commit eliminates the `dbg_block_{begin,end}` instructions from both ZIR and AIR. Instead, lexical scoping of `dbg_var_{ptr,val}` instructions is decided based on the AIR block they exist within. This is a much more robust system, and also results in a huge drop in ZIR bytes - around 7% for Sema.zig.

This required some enhancements to Sema to prevent elision of blocks when they are required for debug variable scoping. This can be observed by looking at the AIR for the following simple test program with and without `-fstrip`:

```zig
export fn f() void {
    {
        var a: u32 = 0;
        _ = &a;
    }
    {
        var a: u32 = 0;
        _ = &a;
    }
}
```

When `-fstrip` is passed, no AIR blocks are generated. When `-fno-strip` is passed, the ZIR blocks are lowered to true AIR blocks to give correct lexical scoping to the debug vars.

The changes here incidentally resolve #19060. A corresponding behavior test has been added.

Resolves: #19060
2024-02-26 | AstGen: fix OoB crash on `ast-check -t` | John Schmidt
The `decl_node` was offset from the wrong source node.
2024-02-22 | Module: fix `@embedFile` of files containing zero bytes | Jacob Young
If an adapted string key with embedded nulls was put in a hash map with `std.hash_map.StringIndexAdapter`, an incorrect hash would be entered for that entry. A later lookup for the key that matches the prefix of the original key up to the first null would then sometimes match this entry due to hash collisions, and sometimes not if performed after a grow + rehash. The same key could therefore exist with two different indices, breaking every string equality comparison ever: for example, claiming that a container type doesn't contain a field, because the field name string in the struct and the string representing the identifier to look up might be equal strings but have different string indices. This could maybe be fixed by changing `std.hash_map.StringIndexAdapter.hash` to only hash up to the first null, thereby ensuring that the entry's hash is correct and that all future lookups will be consistent, but I don't trust anything, so instead I assert that there are no embedded nulls.
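The underlying hazard can be shown with a minimal sketch of a null-terminated string table. The `string_bytes` buffer and the `lookup` helper are hypothetical stand-ins, not the compiler's actual data structures; only `std.mem.sliceTo` is a real standard-library call.

```zig
const std = @import("std");

/// Hypothetical helper: reads the null-terminated string that starts at
/// `index` in a shared byte buffer, as an index-based string table would.
fn lookup(string_bytes: []const u8, index: usize) []const u8 {
    return std.mem.sliceTo(string_bytes[index..], 0);
}

pub fn main() void {
    // The 5-byte key "ab\x00cd", stored with a trailing null terminator.
    const string_bytes = "ab\x00cd\x00";
    const read_back = lookup(string_bytes, 0);
    // Only the prefix up to the first null survives the round trip, so the
    // stored key and the key read back by index no longer agree.
    std.debug.assert(std.mem.eql(u8, read_back, "ab"));
    std.debug.print("stored 5 bytes, read back {d}\n", .{read_back.len});
}
```

Since a key read back by index can differ from the key that was inserted, asserting that keys contain no embedded nulls rules the mismatch out at the source.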
2024-02-16 | Zir: make src_node of type declarations non-optional | mlugg
Previously, the `src_node` field of `struct_decl`, `union_decl`, `enum_decl`, and `opaque_decl` was optional, included in trailing data only if a flag in `Small` was set. However, this was unnecessary logic: AstGen always provided the source node. We can simplify a few bits of logic by making this field non-optional, moving it into non-trailing data. There was one place where the field was actually omitted before: the root struct of a file was at source node 0, so the node was coincidentally elided. Therefore, this commit has a fixed cost of 4 bytes of ZIR per file.
2024-02-16 | AstGen: migrate `ty` result locations to `coerced_ty` | mlugg
In most cases where AstGen is coercing to a fixed type (such as `u29`, `type`, `std.builtin.CallingConvention`) we do not necessarily require an explicit coercion instruction. Instead, Sema knows the type that is required, and can perform the coercion after the fact. This means we can use the `coerced_ty` result location kind, saving unnecessary coercion instructions and therefore ZIR bytes. This required a few enhancements to Sema to introduce missing coercions.
2024-02-16 | Sema: eliminate `src` field | mlugg
`sema.src` is a failed experiment. It introduces complexity, and makes often unwarranted assumptions about the existence of instructions providing source locations, requiring an unreasonable amount of caution in AstGen for correctness. Eliminating it simplifies the whole frontend. This required adding source locations to a few instructions, but the cost in ZIR bytes should be counteracted by the other work on this branch.
2024-02-16 | AstGen: fix elision of unnecessary `dbg_stmt` instructions | mlugg
AstGen has logic to elide leading `dbg_stmt` instructions when multiple are emitted consecutively; however, it only applied in some cases. A simple reshuffle here makes this logic apply universally, saving some bytes in ZIR.
2024-02-16 | AstGen: avoid emitting multiple `ret_type` instructions | mlugg
This is a small optimization to generated ZIR. In any function where the return type is not a trivial Ref, we know it is almost certainly not `void` (unless the user aliased it or did something else weird to fool AstGen), and thus the return type is very likely to be required for return value RLS at some point. Thus, we can just emit one `ret_type` at the start of the function and use it throughout. This sees a very small improvement in overall ZIR bytes.
2024-02-04 | Zir: store extra source hashes required for incremental | mlugg
Also add corresponding invalidation logic to Zcu. Therefore, the only invalidation logic which is not yet in place is `decl_val` dependencies.
2024-01-23 | Zir: represent declarations via an instruction | mlugg
This commit changes how declarations (`const`, `fn`, `usingnamespace`, etc) are represented in ZIR. Previously, these were represented in the container type's extra data (e.g. as trailing data on a `struct_decl`). However, this introduced the complexity of the ZIR mapping logic having to also correlate some ZIR extra data indices. That isn't really a problem today, but it's tricky for the introduction of `TrackedInst` in the commit following this one. Instead, these type declarations now simply contain a trailing list of ZIR indices to `declaration` instructions, which directly encode all data related to the declaration (including containing the declaration's body). Additionally, the ZIR for `align` etc have been split out into their own bodies. This is not strictly necessary, but it's much simpler to understand for an insignificant cost in bytes, and will simplify the resolution of #131 (where we may need to evaluate the pointer type, including align etc, without immediately evaluating the value body).
2024-01-20 | AstGen: detect duplicate field names | David Rubin
This logic was previously in Sema, which was unnecessary complexity, and meant the issue was not detected unless the declaration was semantically analyzed. This commit finishes the work which 941090d started. Resolves: #17916
2024-01-18 | astgen: fix error return trace on error union switch | dweiller
2024-01-16 | AstGen: use correct token_src for switch, if and while exprs | travisstaloch
fixes #18579
2024-01-16 | AstGen: properly handle ill-formed switch on error | Techatrix
2024-01-16 | AstGen: add error message for capture error by ref in switch on error | Techatrix
2024-01-09 | AstGen: add error for redundant comptime var in comptime scope (#18242) | Bogdan Romanyuk
2024-01-09 | fixup! astgen: use switch_block_err_union | dweiller
2024-01-09 | astgen/sema: fix source locations for switch_block_err_union | dweiller
2024-01-09 | astgen/sema: use switch_block_err_union for if-else-switch | dweiller
2024-01-09 | fix x86_64 crashes for switch_block_err_union | dweiller
This change only emits the unwrap_errunion_err instruction if the error capture is actually used in a branch.
2024-01-09 | astgen: use switch_block_err_union | dweiller
2024-01-09 | zir: add switch_block_err_union | dweiller
2024-01-09 | zir: remove unused zir as instruction | dweiller
2024-01-08 | add type safety to ZIR for null terminated strings | Ali Chraghi
2024-01-03 | cbe: fix non-msvc externs and exports | Jacob Young
Closes #17817
2023-12-08 | AstGen: add error for using inline loops in comptime only scopes | Veikka Tuominen
2023-11-26 | AstGen: check allowed non-function builtins with declarative field (#18120) | Bogdan Romanyuk
2023-11-25 | Compiler: move checking function-scope-only builtins to AstGen | Bogdan Romanyuk
2023-11-24 | frontend: move AstRlAnnotate to std.zig namespace | Meghan Denny
2023-11-24 | frontend: move BuiltinFn to std.zig namespace | Meghan Denny
2023-11-24 | AstGen: remove calls to tracy | Meghan Denny
2023-11-19 | AstGen: preserve result type in comptime block | mlugg
2023-11-19 | compiler: correct unnecessary uses of 'var' | mlugg
2023-11-19 | compiler: add error for unnecessary use of 'var' | mlugg
When a local variable is never used as an lvalue, we can determine that `const` would be sufficient for this variable, so emit an error in this case. More sophisticated checking is unfortunately not possible with Zig's current analysis model, since whether an lvalue is actually mutated depends on semantic analysis, in which some code paths may not be analyzed, so attempting to determine this would result in false positive compile errors. It's worth noting that an unfortunate consequence of this is that any field call `a.b()` will allow `a` to be `var`, even if `b` does not take a pointer as its first parameter - this is again a necessary compromise because the parameter type is not known until semantic analysis. Also update `translate-c` to not trigger these errors. This is done by replacing the `_ = @TypeOf(x)` emitted with `_ = &x` - the reference there means that the local is permitted to be `var`. A similar strategy will be used to prevent compile errors in the behavior tests, where we sometimes want to force a value to be runtime-known. Resolves: #224
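A minimal sketch of the rule described above, with hypothetical function names: a local that is never used as an lvalue must be `const`, while taking its address with `_ = &x` is enough for `var` to be permitted.

```zig
const std = @import("std");

fn runtimeValue() u32 {
    // `x` is never mutated, but `&x` counts as an lvalue use, so `var` is
    // permitted here and `x` is treated as runtime-known.
    var x: u32 = 42;
    _ = &x;
    return x;
}

fn constValue() u32 {
    // Never used as an lvalue: declaring this with `var` instead of `const`
    // would trigger the compile error this commit adds.
    const y: u32 = 42;
    return y;
}

pub fn main() void {
    std.debug.assert(runtimeValue() == constValue());
}
```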
2023-11-16 | Move duplicate field detection for struct init expressions into AstGen | David
Partially addresses #17916.
2023-11-08 | Sema: optimize runtime array_mul | mlugg
There are two optimizations here, which work together to avoid a pathological case. The first optimization is that AstGen now records the result type of an array multiplication expression where possible. This type is not used according to the language specification, but instead as an optimization. In the expression '.{x} ** 1000', if we know that the result must be an array, then it is much more efficient to coerce the LHS to an array with length 1 before doing the multiplication. Otherwise, we end up with a 1000-element tuple which we must coerce to an array by individually extracting each field. Secondly, the previous logic would repeatedly extract element/field values from the LHS when initializing the result. This is unnecessary: each element must only be extracted once, and the result reused. These changes together give huge improvements to compiler performance on a pathological case: AIR instructions go from 65551 to 15, and total AIR bytes go from 1.86MiB to 264.57KiB. Codegen time spent on this function (in a debug compiler build) goes from minutes to essentially zero. Resolves: #17586
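As an illustration of the first optimization, here is a small sketch of the kind of expression involved. The function name `filled` is hypothetical, and the example is scaled to match the `.{x} ** 1000` case from the message; it assumes the declared return type is what supplies the array result type.

```zig
const std = @import("std");

fn filled(x: u8) [1000]u8 {
    // The return type gives `.{x} ** 1000` an array result type, so the
    // left-hand side can be coerced to a one-element array before the
    // multiplication instead of building a 1000-field tuple first.
    return .{x} ** 1000;
}

pub fn main() void {
    var seed: u8 = 7;
    _ = &seed; // keep `seed` runtime-known
    const arr = filled(seed);
    std.debug.assert(arr.len == 1000);
    std.debug.assert(arr[0] == 7 and arr[999] == 7);
}
```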
2023-11-07 | sema: analyze field init bodies in a second pass | kcbanner
This change allows struct field inits to use layout information of their own struct without causing a circular dependency. `semaStructFields` caches the ranges of the init bodies in the `StructType` trailing data. The init bodies are then resolved by `resolveStructFieldInits`, which is called before the inits are actually required. Within the init bodies, the struct decl's instruction is repurposed to refer to the field type itself. This is to allow us to easily rebuild the inst_map mapping required for the init body instructions to refer to the field type. Thanks to @mlugg for the guidance on this one!
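A sketch of the kind of declaration this is meant to allow, assuming a field default that queries the size of its own struct. The `Header` type and its fields are hypothetical, chosen to resemble C-style headers that record their own size.

```zig
const std = @import("std");

/// Hypothetical header type whose first field defaults to the size of the
/// struct itself, which needs the struct's layout while its init is resolved.
const Header = extern struct {
    cb_size: u32 = @sizeOf(@This()),
    flags: u32 = 0,
    reserved: u64 = 0,
};

pub fn main() void {
    const h: Header = .{};
    std.debug.assert(h.cb_size == @sizeOf(Header));
}
```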
2023-10-28 | make Zir.Inst.Index typed | Andrew Kelley
This commit starts by making Zir.Inst.Index a nonexhaustive enum rather than a u32 alias for type safety purposes, and the rest of the changes are needed to get everything compiling again.
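The general technique can be sketched as follows. `Index` and `next` here are hypothetical stand-ins, not the actual `Zir.Inst.Index` definition: a non-exhaustive enum backed by `u32` keeps the same representation as the integer, but conversions must be spelled out.

```zig
const std = @import("std");

/// Hypothetical typed index: every `u32` value is valid, but the type is
/// distinct from `u32`, so the two cannot be mixed up silently.
const Index = enum(u32) {
    _,
};

fn next(i: Index) Index {
    // Crossing between the enum and its integer is always explicit.
    return @enumFromInt(@intFromEnum(i) + 1);
}

pub fn main() void {
    const first: Index = @enumFromInt(0);
    const second = next(first);
    std.debug.assert(@intFromEnum(second) == 1);
    // Passing a bare `u32` variable to `next` is a compile error, whereas a
    // plain `u32` alias would have accepted it.
}
```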
2023-10-23 | Merge pull request #17651 from Vexu/error-limit | Andrew Kelley
Make distinct error limit configurable (attempt #2)
2023-10-22 | remove uses of non-configurable `err_int` | Veikka Tuominen
2023-10-21 | AstGen: omit make_ptr_const for resolve_inferred_alloc | mlugg
After the previous commit, these make_ptr_const ZIR instructions are redundant.
2023-10-13 | drop for loop syntax upgrade mechanisms | Andrew Kelley
2023-10-01 | Sema: add `@errorCast` which works for both error sets and error unions | Veikka Tuominen
Closes #17343
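A brief sketch of `@errorCast` applied to both an error union and a bare error set. The error sets and function names are hypothetical; the destination type is taken from the result type, here each function's return type.

```zig
const std = @import("std");

const Wide = error{ NotFound, Denied };
const Narrow = error{NotFound};

fn mayFail(fail: bool) Wide!u32 {
    return if (fail) error.NotFound else 1;
}

// On an error union: the payload is kept, only the error set is cast.
fn narrowed(fail: bool) Narrow!u32 {
    return @errorCast(mayFail(fail));
}

// On a bare error set value.
fn narrowErr(e: Wide) Narrow {
    return @errorCast(e);
}

pub fn main() void {
    const ok = narrowed(false) catch unreachable;
    std.debug.assert(ok == 1);
    if (narrowed(true)) |_| unreachable else |e| std.debug.assert(e == error.NotFound);
    std.debug.assert(narrowErr(error.NotFound) == error.NotFound);
}
```

Casting to a narrower set is safety-checked, so this sketch only ever passes errors that exist in `Narrow`.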
2023-09-27 | Rename `@fabs` to `@abs` and accept integers | antlilja
Replaces the @fabs builtin with a new @abs builtin which accepts floats, signed integers and vectors of said types.
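A short sketch of `@abs` on an integer and a float; the wrapper names are hypothetical. For a signed integer operand the result is the unsigned integer type of the same bit width, which is why the minimum value of `i32` is representable.

```zig
const std = @import("std");

fn absInt(x: i32) u32 {
    // A signed operand yields the unsigned type of the same bit width.
    return @abs(x);
}

fn absFloat(x: f32) f32 {
    return @abs(x);
}

pub fn main() void {
    std.debug.assert(absInt(-5) == 5);
    std.debug.assert(absFloat(-2.5) == 2.5);
    // The magnitude of minInt(i32) does not fit in i32 but does fit in u32.
    std.debug.assert(absInt(std.math.minInt(i32)) == 2147483648);
}
```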
2023-09-23 | compiler: preserve result type information through address-of operator | mlugg
This commit introduces the new `ref_coerced_ty` result type into AstGen. This represents an expression which we want to treat as an lvalue, and the pointer will be coerced to a given type.

This change gives known result types to many expressions, in particular struct and array initializations. This allows certain casts to work which previously required explicitly specifying types via `@as`. It also eliminates our dependence on anonymous struct types for expressions of the form `&.{ ... }` - this paves the way for #16865, and also results in less Sema magic happening for such initializations, also leading to potentially better runtime code.

As part of these changes, this commit also implements #17194 by disallowing RLS on explicitly-typed struct and array initializations. Apologies for linking these changes - it seemed rather pointless to try and separate them, since they both make big changes to struct and array initializations in AstGen. The rationale for this change can be found in the proposal - in essence, performing RLS whilst maintaining the semantics of the intermediary type is a very difficult problem to solve.

This allowed the problematic `coerce_result_ptr` ZIR instruction to be completely eliminated, which in turn also simplified the logic for inferred allocations in Sema - thanks to this, we almost break even on line count!

In doing this, the ZIR instructions surrounding these initializations have been restructured - some have been added and removed, and others renamed for clarity (and their semantics changed slightly). In order to optimize ZIR tag count, the `struct_init_anon_ref` and `array_init_anon_ref` instructions have been removed in favour of using `ref` on a standard anonymous value initialization, since these instructions are now virtually never used.
Lastly, it's worth noting that this commit introduces a slightly strange source of generic poison types: in the expression `@as(*anyopaque, &x)`, the sub-expression `x` has a generic poison result type, despite no generic code being involved. This turns out to be a logical choice, because we don't know the result type for `x`, and the generic poison type represents precisely this case, providing the semantics we need. Resolves: #16512 Resolves: #17194
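A sketch of what carrying a result type through `&` can look like in user code, under the assumption that the pointer type on the declaration is what supplies the element type. The function name `pair` is hypothetical.

```zig
const std = @import("std");

fn pair(a: u32, b: u32) [2]u8 {
    // The declared pointer type gives `&.{ ... }` an array result type, so
    // each `@intCast` learns its destination type (`u8`) without `@as`.
    const p: *const [2]u8 = &.{ @intCast(a), @intCast(b) };
    return p.*;
}

pub fn main() void {
    const bytes = pair(1, 2);
    std.debug.assert(bytes[0] == 1 and bytes[1] == 2);
}
```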
2023-09-22 | AstGen: fix @export with undeclared identifier crashing | Wooster
This required a third `if (found_already == null)` in another place in AstGen.zig for this special case of `@export`. Fixes #17188
2023-09-21 | InternPool: implement getStructType | Andrew Kelley
This also modifies AstGen so that struct types use 1 bit each from the flags to communicate if there are nonzero inits, alignments, or comptime fields. This allows adding a struct type to the InternPool without looking ahead in memory to find out the answers to these questions, which is easier for CPUs as well as for me, coding this logic right now.
2023-09-17 | AstGen: allow closure over known-runtime values within @TypeOf | mlugg
AstGen emits an error when a closure over a known-runtime value crosses a namespace boundary. This usually makes sense; however, this usage is actually valid if the capture is within a `@TypeOf` operand. Sema already has a special case to allow such closure within `@TypeOf` when AstGen could not determine a value to be runtime-known. This commit simply introduces analogous logic to AstGen to allow `var`s to cross namespace boundaries within `@TypeOf`.