| author | Frank Denis <124872+jedisct1@users.noreply.github.com> | 2022-11-17 13:07:07 +0100 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2022-11-17 13:07:07 +0100 |
| commit | 7cfeae1ce7aa9f1b3a219d032c43bc2e694ba63b (patch) | |
| tree | 7818e427398bef4e3415a095db5bfe600fbd22fc /src/stage1 | |
| parent | 58d9004cea5f8aa73c76382cd21e1c88b1bc21e1 (diff) | |
| download | zig-7cfeae1ce7aa9f1b3a219d032c43bc2e694ba63b.tar.gz zig-7cfeae1ce7aa9f1b3a219d032c43bc2e694ba63b.zip | |
std.crypto.onetimeauth.ghash: faster GHASH on modern CPUs (#13566)
* std.crypto.onetimeauth.ghash: faster GHASH on modern CPUs
Carryless multiplication used to be slow on older Intel CPUs, which
justified the use of Karatsuba multiplication.
This is no longer the case: using 4 multiplications to multiply two
128-bit numbers is actually faster than 3 multiplications plus the
extra shifts and additions that Karatsuba requires.
This is also true on aarch64.
Keep using Karatsuba only when targeting x86 (granted, this is a bit
of a blunt shortcut; we should really list all the CPU models that
have a slow clmul instruction).
Also remove the useless agg_2 threshold and restore the ability to
precompute only H and H^2 in ReleaseSmall.
Finally, avoid using u256; using 128-bit registers is actually faster.
* Use a switch, add some comments
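
To make the tradeoff concrete, here is a minimal sketch in C with x86 intrinsics (not the Zig code touched by this commit): the same 128x128-bit carryless multiplication computed once with 4 PCLMULQDQ instructions (schoolbook) and once with 3 PCLMULQDQ plus the extra shifts and XORs that Karatsuba needs. The function names and the `u256` struct are illustrative assumptions, not identifiers from the Zig standard library.

```c
/* Sketch only: 128x128 -> 256-bit carryless multiplication on x86-64.
 * Build with: cc -O2 -msse2 -mpclmul clmul128.c -c
 */
#include <wmmintrin.h>  /* _mm_clmulepi64_si128 (PCLMULQDQ) */
#include <emmintrin.h>  /* SSE2 shifts and XOR */

/* Hypothetical 256-bit result type: two 128-bit registers, no u256 integer. */
typedef struct { __m128i lo, hi; } u256;

/* Schoolbook: 4 carryless multiplications, then just XORs to recombine.
 * Fast when clmul itself has low latency / high throughput. */
static u256 clmul128_schoolbook(__m128i a, __m128i b) {
    __m128i lo  = _mm_clmulepi64_si128(a, b, 0x00); /* a0 * b0 */
    __m128i hi  = _mm_clmulepi64_si128(a, b, 0x11); /* a1 * b1 */
    __m128i m0  = _mm_clmulepi64_si128(a, b, 0x10); /* a0 * b1 */
    __m128i m1  = _mm_clmulepi64_si128(a, b, 0x01); /* a1 * b0 */
    __m128i mid = _mm_xor_si128(m0, m1);
    u256 r;
    r.lo = _mm_xor_si128(lo, _mm_slli_si128(mid, 8)); /* fold mid's low 64 bits */
    r.hi = _mm_xor_si128(hi, _mm_srli_si128(mid, 8)); /* fold mid's high 64 bits */
    return r;
}

/* Karatsuba: only 3 carryless multiplications, but extra shuffles and XORs
 * to form (a0^a1)*(b0^b1) and recover the middle term. Only a win where
 * clmul is slow (older x86 CPUs). */
static u256 clmul128_karatsuba(__m128i a, __m128i b) {
    __m128i lo = _mm_clmulepi64_si128(a, b, 0x00);             /* a0 * b0 */
    __m128i hi = _mm_clmulepi64_si128(a, b, 0x11);             /* a1 * b1 */
    __m128i as = _mm_xor_si128(a, _mm_shuffle_epi32(a, 0x4e)); /* a0 ^ a1 */
    __m128i bs = _mm_xor_si128(b, _mm_shuffle_epi32(b, 0x4e)); /* b0 ^ b1 */
    __m128i mid = _mm_clmulepi64_si128(as, bs, 0x00);          /* (a0^a1)*(b0^b1) */
    mid = _mm_xor_si128(mid, _mm_xor_si128(lo, hi));           /* middle term */
    u256 r;
    r.lo = _mm_xor_si128(lo, _mm_slli_si128(mid, 8));
    r.hi = _mm_xor_si128(hi, _mm_srli_si128(mid, 8));
    return r;
}
```

On CPUs where carryless multiplication is cheap (recent x86_64, and PMULL on aarch64), the fourth multiply in the schoolbook version costs less than Karatsuba's extra shuffles and XORs, which is the observation this commit acts on.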
Diffstat (limited to 'src/stage1')
0 files changed, 0 insertions, 0 deletions
