x86_64.Lower: replace slow stringToEnum call - zig - General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software. https://ziglang.org


author	mlugg <mlugg@mlugg.co.uk>	2025-06-08 17:58:46 +0100
committer	mlugg <mlugg@mlugg.co.uk>	2025-06-12 18:40:01 +0100
commit	43d01ff69f6c6c46bef81dd4de2c78fb0a942b65 (patch)
tree	f369cbfe7c6b9467374169f002a4394222d8d672 /src/link/Queue.zig
parent	71baa5e769b3b82468736a60e0725a94da9be4e9 (diff)

x86_64.Lower: replace slow stringToEnum call

Looking at a compilation of 'test/behavior/x86_64/unary.zig' in callgrind showed that a full 30% of the compiler runtime was spent in this `stringToEnum` call, so optimizing it was low-hanging fruit.

We tried replacing it with nested `switch` statements using `inline else`, but that generated too much code; it didn't emit huge binaries or anything, but LLVM used a *ridiculous* amount of memory compiling it in some cases. The core problem here is that only a small subset of the cases are actually used (the rest fall through to an "error" path), but that subset is computed at comptime, so we must rely on the optimizer to eliminate the thousands of redundant cases. This would be solved by #21507.

Instead, we pre-compute a lookup table at comptime. This table is pretty big (I guess a couple hundred k?), but only the "valid" subset of entries will be accessed in practice (unless a bug in the backend is hit), so it's not too awful on the cache; and it performs much better than the old `std.meta.stringToEnum` call.
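
For illustration, the comptime lookup-table idea can be sketched roughly as below. This is not the code from this commit: `Prefix`, `Stem`, `Mnemonic`, `prefixText`, and `lookup` are hypothetical stand-ins, and the real backend types are far larger. The sketch assumes the mnemonic name is formed by concatenating a prefix and a stem, and that most combinations do not name a real mnemonic.

```zig
const std = @import("std");

// Hypothetical stand-ins for the backend's much larger enums.
const Mnemonic = enum { add, sub, addps, vaddps };
const Prefix = enum { none, v };
const Stem = enum { add, sub, addps };

fn prefixText(p: Prefix) []const u8 {
    return switch (p) {
        .none => "",
        .v => "v",
    };
}

const n_prefix = std.meta.fields(Prefix).len;
const n_stem = std.meta.fields(Stem).len;

// Built once at comptime: every (prefix, stem) pair is resolved to a
// mnemonic by name, or to null when no such mnemonic exists.
const table: [n_prefix][n_stem]?Mnemonic = blk: {
    var t: [n_prefix][n_stem]?Mnemonic = undefined;
    for (std.enums.values(Prefix)) |p| {
        for (std.enums.values(Stem)) |s| {
            const name = prefixText(p) ++ @tagName(s);
            t[@intFromEnum(p)][@intFromEnum(s)] =
                if (@hasField(Mnemonic, name)) @field(Mnemonic, name) else null;
        }
    }
    break :blk t;
};

// At runtime the lookup is two array indexes; no string is built or compared.
fn lookup(p: Prefix, s: Stem) ?Mnemonic {
    return table[@intFromEnum(p)][@intFromEnum(s)];
}

test "lookup resolves valid and invalid combinations" {
    try std.testing.expectEqual(@as(?Mnemonic, .vaddps), lookup(.v, .addps));
    try std.testing.expectEqual(@as(?Mnemonic, .sub), lookup(.none, .sub));
    // "vadd" is not a field of Mnemonic, so this entry is null.
    try std.testing.expectEqual(@as(?Mnemonic, null), lookup(.v, .add));
}
```

The table has an entry for every combination, including the invalid ones, which is why it grows large for real instruction sets; the trade-off is that no per-case code has to be generated and later eliminated by the optimizer.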


