path: root/src/codegen/x86_64
Age | Commit message | Author
2025-11-20 | update deprecated ArrayListUnmanaged usage (#25958) | Benjamin Jurk
2025-11-20 | Merge pull request #25898 from jacobly0/elfv2-progress | Andrew Kelley
Elf2: more progress
2025-11-15 | Legalize: implement soft-float legalizations | Matthew Lugg
A new `Legalize.Feature` tag is introduced for each float bit width (16/32/64/80/128). When e.g. `soft_f16` is enabled, all arithmetic and comparison operations on `f16` are converted to calls to the appropriate compiler_rt function using the new AIR tag `.legalize_compiler_rt_call`. This includes casts where the source *or* target type is `f16`, or integer<=>float conversions to or from `f16`.

Occasionally, operations are legalized to blocks because there is extra code required; for instance, legalizing `@floatFromInt` where the integer type is larger than 64 bits requires calling an arbitrary-width integer conversion function which accepts a pointer to the integer, so we need to use `alloc` to create such a pointer, and store the integer there (after possibly zero-extending or sign-extending it).

No backend currently uses these new legalizations (and as such, no backend currently needs to implement `.legalize_compiler_rt_call`). However, for testing purposes, I tried modifying the self-hosted x86_64 backend to enable all of the soft-float features (and implement the AIR instruction). This modified backend was able to pass all of the behavior tests (except for one `@mod` test where the LLVM backend has a bug resulting in incorrect compiler-rt behavior!), including the tests specific to the self-hosted x86_64 backend.

`f16` and `f80` legalizations are likely of particular interest to backend developers, because most architectures do not have instructions to operate on these types. However, enabling *all* of these legalization passes can be useful when developing a new backend to hit the ground running and pass a good amount of tests more easily.
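As a rough illustration of the idea above, the following Python sketch models how a binary float operation might be rewritten into a runtime-library call when the matching soft-float feature is enabled. It is not the compiler's code: `legalize_binary_op` and its tuple results are hypothetical, and the symbol names simply follow the conventional libgcc/compiler-rt suffix scheme (`hf`, `sf`, `df`, `xf`, `tf`), which is assumed here rather than taken from the commit.

```python
# Conventional soft-float suffixes by bit width (assumed naming scheme).
SUFFIX_BY_BITS = {16: "hf", 32: "sf", 64: "df", 80: "xf", 128: "tf"}
BINARY_OPS = ("add", "sub", "mul", "div")


def legalize_binary_op(op, bits, soft_float_bits):
    """Hypothetical model of a soft-float legalization decision.

    Returns ("native", op, bits) when the width is handled by hardware,
    or ("call", symbol) when the operation becomes a runtime-library call.
    """
    if op not in BINARY_OPS:
        raise ValueError(f"unsupported operation: {op}")
    if bits not in SUFFIX_BY_BITS:
        raise ValueError(f"unsupported float width: {bits}")
    if bits not in soft_float_bits:
        return ("native", op, bits)
    return ("call", f"__{op}{SUFFIX_BY_BITS[bits]}3")


# With only the 16-bit feature enabled, f16 ops become calls and f32 ops do not.
print(legalize_binary_op("add", 16, {16}))
print(legalize_binary_op("mul", 32, {16}))
```

The sketch covers only four binary operations; comparisons, casts, and the wide-integer `@floatFromInt` case described above would need additional handling.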
2025-11-12 | Air.Legalize: revert to loops for scalarizations | Matthew Lugg
I had tried unrolling the loops to avoid requiring the `vector_store_elem` instruction, but it's arguably a problem to generate O(N) code for an operation on `@Vector(N, T)`. In addition, that lowering emitted a lot of `.aggregate_init` instructions, which is itself a quite difficult operation to codegen.

This requires reintroducing runtime vector indexing internally. However, I've put it in a couple of instructions which are intended only for use by `Air.Legalize`, named `legalize_vec_elem_val` (like `array_elem_val`, but for indexing a vector with a runtime-known index) and `legalize_vec_store_elem` (like the old `vector_store_elem` instruction). These are explicitly documented as *not* being emitted by Sema, so need only be implemented by backends if they actually use an `Air.Legalize.Feature` which emits them (otherwise they can be marked as `unreachable`).
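The loop-based shape described above can be pictured with a small Python model. This is only a sketch of the general technique: `scalarize_with_loop` is a hypothetical helper, and the comments merely mark which steps play the roles of `legalize_vec_elem_val` and `legalize_vec_store_elem`.

```python
def scalarize_with_loop(lhs, rhs, scalar_op):
    """Apply scalar_op element-wise with a single loop body, whatever N is."""
    assert len(lhs) == len(rhs)
    result = [None] * len(lhs)
    index = 0  # runtime-known index, as opposed to N unrolled constants
    while index < len(lhs):
        a = lhs[index]  # role of legalize_vec_elem_val on the first operand
        b = rhs[index]  # role of legalize_vec_elem_val on the second operand
        result[index] = scalar_op(a, b)  # role of legalize_vec_store_elem
        index += 1
    return result


print(scalarize_with_loop([1, 2, 3], [10, 20, 30], lambda a, b: a + b))
```

The loop body is emitted once, so the amount of generated code stays constant in N, whereas an unrolled lowering would repeat the body N times and then assemble the result from N scalars.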
2025-11-12 | x86_64: spill eflags when initializing bool vector | Matthew Lugg
2025-11-12 | compiler: spring cleaning | Matthew Lugg
I started this diff trying to remove a little dead code from the C backend, but ended up finding a bunch of dead code sprinkled all over the place:
* `packed` handling in the C backend which was made dead by `Legalize`
* Representation of pointers to runtime-known vector indices
* Handling for the `vector_store_elem` AIR instruction (now removed)
* Old tuple handling from when they used the InternPool repr of structs
* Straightforward unused functions
* TODOs in the LLVM backend for features which Zig just does not support
2025-11-11 | Elf2: implement PLT | Jacob Young
2025-11-04 | x86_64: implement split vector stores | Jacob Young
Closes #25809
2025-10-30 | Merge pull request #25558 from jacobly0/elfv2-load-obj | Jacob Young
Elf2: start implementing input object loading
2025-10-29 | x86_64: add `lret` encoding | Jacob Young
Closes #25608
2025-10-29 | x86_64: continue hacking around unimplemented linker logic | Jacob Young
Closes #25666
2025-10-29 | x86_64: fix encoding for out with an immediate port | Jacob Young
Closes #25547
2025-10-29 | Elf2: start implementing dynamic linking | Jacob Young
2025-10-29 | Elf2: load relocations from input objects | Jacob Young
2025-10-10 | Coff: implement threadlocal variables | Jacob Young
2025-10-03 | x86_64: fix bool vector init register clobber | Jacob Young
Closes #25439
2025-10-02 | Coff: delete | Jacob Young
2025-10-02 | Coff2: create a new linker from scratch | Jacob Young
2025-10-02 | x86_64: fix windows calling convention abi | Jacob Young
2025-09-27 | x86_64: fix `@mulAdd` miscomp | Jacob Young
2025-09-27 | x86_64: fix `~`/`!` miscomps | Jacob Young
2025-09-27 | x86_64: fix `@floatFromInt` miscomps | Jacob Young
2025-09-27 | x86_64: fix unencodable `rem` lowering | mlugg
The memory operand might use one of the extended GPRs R8 through R15 and hence require a REX prefix, but having a REX prefix makes the high-byte register AH unencodable as the src operand. This latent bug was exposed by this branch, presumably because `select` now happens to be putting something in an extended GPR instead of a legacy GPR.

In theory this could be fixed with minimal cost by introducing a way to communicate to `select` that neither the destination memory nor the other temporary can be in an extended GPR. However, I just went for the simple solution which comes at a cost of one trivial instruction: copy the remainder from AH to AL, and *then* copy AL to the destination.
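The encoding constraint behind this fix can be modelled in a few lines of Python. This is a deliberately simplified sketch: `is_encodable` is a hypothetical helper that looks only at register names, ignoring other reasons an instruction may need a REX prefix (such as 64-bit operand size or the byte registers SPL, BPL, SIL, and DIL).

```python
HIGH_BYTE_REGS = {"ah", "ch", "dh", "bh"}
EXTENDED_GPRS = {f"r{n}" for n in range(8, 16)}  # r8 through r15


def is_encodable(operand_regs):
    """Return False when an instruction would need both REX and a high-byte register.

    operand_regs lists every register the instruction names, including
    base and index registers of a memory operand.
    """
    regs = set(operand_regs)
    needs_rex = bool(regs & EXTENDED_GPRS)
    uses_high_byte = bool(regs & HIGH_BYTE_REGS)
    return not (needs_rex and uses_high_byte)


# Storing AH directly through a pointer held in r8 cannot be encoded...
print(is_encodable(["r8", "ah"]))  # False
# ...but the two-step workaround from the commit message can.
print(is_encodable(["al", "ah"]))  # True: mov al, ah
print(is_encodable(["r8", "al"]))  # True: mov [r8], al
```

The extra `mov al, ah` is the "one trivial instruction" the commit message refers to; the alternative would be constraining register selection so that no operand lands in an extended GPR.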
2025-09-27 | x86_64: fix miscompilation of `mul` on vectors of large ints | mlugg
2025-09-27 | x86_64: generate better constant memcpy code | mlugg
`rep movsb` isn't usually a great idea here. This commit makes the logic which tentatively existed in `genInlineMemcpy` apply in more cases, and in particular applies it to the "new" backend logic. Put simply, all copies of 128 bytes or fewer will now attempt this path first, where (provided there is an SSE register and/or a general-purpose register available) we will lower the operation using a sequence of 32, 16, 8, 4, 2, and 1 byte copy operations.

The feedback I got on this diff was "Push it to master and if it miscomps I'll revert it" so don't blame me when it explodes
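To make the size decomposition concrete, here is a minimal Python sketch that splits a constant copy length into 32, 16, 8, 4, 2, and 1 byte pieces. The greedy, non-overlapping strategy and the name `plan_copy` are assumptions for illustration; the commit does not say exactly how the backend picks its sequence, and the real choice also depends on which registers are available.

```python
CHUNK_SIZES = (32, 16, 8, 4, 2, 1)
MAX_INLINE_LEN = 128  # threshold quoted in the commit message


def plan_copy(length):
    """Return a list of (offset, size) copies, or None if the length is too large.

    Greedy, non-overlapping decomposition (an assumed strategy).
    """
    if length < 0:
        raise ValueError("length must be non-negative")
    if length > MAX_INLINE_LEN:
        return None  # caller would fall back to another lowering
    plan = []
    offset = 0
    for size in CHUNK_SIZES:
        while length - offset >= size:
            plan.append((offset, size))
            offset += size
    return plan


print(plan_copy(47))   # [(0, 32), (32, 8), (40, 4), (44, 2), (46, 1)]
print(plan_copy(129))  # None
```

Each `(offset, size)` pair stands for one load/store pair; a 32- or 16-byte piece would go through a vector register and the smaller ones through a general-purpose register.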
2025-09-26 | compiler: move self-hosted backends from src/arch to src/codegen | Alex Rønne Petersen