|
To quote the language reference:
It is generally better to let the compiler decide when to inline a
function, except for these scenarios:
* To change how many stack frames are in the call stack, for debugging
purposes.
* To force comptime-ness of the arguments to propagate to the return
value of the function, as in the above example.
* Real world performance measurements demand it. Don't guess!
Note that inline actually restricts what the compiler is allowed to do.
This can harm binary size, compilation speed, and even runtime
performance.
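The example the quoted passage refers to is along these lines (paraphrased from the language reference; `foo` is just a placeholder name):

```zig
inline fn foo(a: i32, b: i32) i32 {
    return a + b;
}

test "inline function call" {
    // The call is inline and both arguments are comptime-known, so the
    // result is comptime-known too and this check runs at compile time.
    if (foo(1200, 34) != 1234) {
        @compileError("bad");
    }
}
```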
`zig run lib/std/crypto/benchmark.zig -OReleaseFast`
before → after
md5: 990 → 998 MiB/s
sha1: 1144 → 1140 MiB/s
sha256: 2267 → 2275 MiB/s
sha512: 762 → 767 MiB/s
sha3-256: 680 → 683 MiB/s
sha3-512: 362 → 363 MiB/s
shake-128: 835 → 839 MiB/s
shake-256: 680 → 681 MiB/s
turboshake-128: 1567 → 1570 MiB/s
turboshake-256: 1276 → 1282 MiB/s
blake2s: 778 → 789 MiB/s
blake2b: 1071 → 1086 MiB/s
blake3: 1148 → 1137 MiB/s
ghash: 10044 → 10033 MiB/s
polyval: 9726 → 10033 MiB/s
poly1305: 2486 → 2703 MiB/s
hmac-md5: 991 → 998 MiB/s
hmac-sha1: 1134 → 1137 MiB/s
hmac-sha256: 2265 → 2288 MiB/s
hmac-sha512: 765 → 764 MiB/s
siphash-2-4: 4410 → 4438 MiB/s
siphash-1-3: 7144 → 7225 MiB/s
siphash128-2-4: 4397 → 4449 MiB/s
siphash128-1-3: 7281 → 7374 MiB/s
aegis-128x4 mac: 73385 → 74523 MiB/s
aegis-256x4 mac: 30160 → 30539 MiB/s
aegis-128x2 mac: 66662 → 67267 MiB/s
aegis-256x2 mac: 16812 → 16806 MiB/s
aegis-128l mac: 33876 → 34055 MiB/s
aegis-256 mac: 8993 → 9087 MiB/s
aes-cmac: 2036 MiB/s
x25519: 20670 → 16844 exchanges/s
ed25519: 29763 → 29576 signatures/s
ecdsa-p256: 4762 → 4900 signatures/s
ecdsa-p384: 1465 → 1500 signatures/s
ecdsa-secp256k1: 5643 → 5769 signatures/s
ed25519: 21926 → 21721 verifications/s
ed25519: 51200 → 50880 verifications/s (batch)
chacha20Poly1305: 1189 → 1109 MiB/s
xchacha20Poly1305: 1196 → 1107 MiB/s
xchacha8Poly1305: 1466 → 1555 MiB/s
xsalsa20Poly1305: 660 → 620 MiB/s
aegis-128x4: 76389 → 78181 MiB/s
aegis-128x2: 53946 → 53495 MiB/s
aegis-128l: 27219 → 25621 MiB/s
aegis-256x4: 49351 → 49542 MiB/s
aegis-256x2: 32390 → 32366 MiB/s
aegis-256: 8881 → 8944 MiB/s
aes128-gcm: 6095 → 6205 MiB/s
aes256-gcm: 5306 → 5427 MiB/s
aes128-ocb: 8529 → 13974 MiB/s
aes256-ocb: 7241 → 9442 MiB/s
isapa128a: 204 → 214 MiB/s
aes128-single: 133857882 → 134170944 ops/s
aes256-single: 96306962 → 96408639 ops/s
aes128-8: 1083210101 → 1073727253 ops/s
aes256-8: 762042466 → 767091778 ops/s
bcrypt: 0.009 s/ops
scrypt: 0.018 → 0.017 s/ops
argon2: 0.037 → 0.060 s/ops
kyber512d00: 206057 → 205779 encaps/s
kyber768d00: 156074 → 150711 encaps/s
kyber1024d00: 116626 → 115469 encaps/s
kyber512d00: 181149 → 182046 decaps/s
kyber768d00: 136965 → 135676 decaps/s
kyber1024d00: 101307 → 100643 decaps/s
kyber512d00: 123624 → 123375 keygen/s
kyber768d00: 69465 → 70828 keygen/s
kyber1024d00: 43117 → 43208 keygen/s
|
|
Before:
* std.Target.arm.featureSetHas(target.cpu.features, .has_v7)
* std.Target.x86.featureSetHasAny(target.cpu.features, .{ .sse, .avx, .cmov })
* std.Target.wasm.featureSetHasAll(target.cpu.features, .{ .atomics, .bulk_memory })
After:
* target.cpu.has(.arm, .has_v7)
* target.cpu.hasAny(.x86, &.{ .sse, .avx, .cmov })
* target.cpu.hasAll(.wasm, &.{ .atomics, .bulk_memory })
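As a hedged sketch of how the new form reads in practice (the `.aes` feature names and the helper function are illustrative, not part of this change):

```zig
const std = @import("std");

fn hasAesAcceleration(target: std.Target) bool {
    // Query the per-architecture feature set through the new target.cpu API.
    return switch (target.cpu.arch) {
        .x86_64 => target.cpu.has(.x86, .aes),
        .aarch64 => target.cpu.has(.aarch64, .aes),
        else => false,
    };
}
```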
|
|
std.crypto breaks the naming conventions in quite a few places.
This is the beginning of an effort to address that.
Deprecates `std.crypto.utils`.
|
|
The function returns the vector length (an element count), not the byte size of the vector or the bit size of individual elements. This distinction is important: some usages of this function in the stdlib operated under those incorrect assumptions.
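A minimal sketch of the distinction, assuming the function in question is `std.simd.suggestVectorLength`:

```zig
const std = @import("std");

// The suggested value is a number of elements. On a target with 128-bit
// SIMD registers, suggestVectorLength(u8) is typically 16 (elements),
// i.e. @Vector(16, u8): 16 bytes total with 8-bit elements -- not 128
// (the register width in bits) and not 1 (the element size in bytes).
const Block = if (std.simd.suggestVectorLength(u8)) |len|
    @Vector(len, u8)
else
    [16]u8;
```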
|
|
|
|
Let's take this breaking change opportunity to fix the style of this
enum.
|
|
* `@clz`
* `@ctz`
* `@popCount`
* `@byteSwap`
* `@bitReverse`
* various encodings used by std
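For reference, the behavior of the listed builtins on a small example (a sketch; the value is arbitrary):

```zig
const std = @import("std");

test "bit manipulation builtins" {
    const x: u8 = 0b0001_0110;
    try std.testing.expect(@clz(x) == 3); // leading zero bits
    try std.testing.expect(@ctz(x) == 1); // trailing zero bits
    try std.testing.expect(@popCount(x) == 3); // set bits
    try std.testing.expect(@byteSwap(x) == x); // a single byte has nothing to swap
    try std.testing.expect(@bitReverse(x) == 0b0110_1000);
}
```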
|
|
This reverts commit 6f0198cadbe29294f2bf3153a27beebd64377566.
|
|
This reverts commit 0c99ba1eab63865592bb084feb271cd4e4b0357e, reversing
changes made to 5f92b070bf284f1493b1b5d433dd3adde2f46727.
This caused a CI failure when it landed in the master branch, due to a
128-bit `@byteSwap` in std.mem.
|
|
|
|
Most of this migration was performed automatically with `zig fmt`. There
were a few exceptions that I had to fix manually:
* `@alignCast` and `@addrSpaceCast` cannot be automatically rewritten
* `@truncate`'s fixup is incorrect for vectors
* Test cases are not formatted, and their error locations change
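For context, a hedged sketch of the kind of rewrite involved (names and values are illustrative, not taken from the actual diff):

```zig
// Before, the destination type or alignment was an explicit argument:
//     const small = @truncate(u8, wide);
//     const aligned = @alignCast(4, ptr);
// After, it is inferred from the result location:
fn example(wide: u32, ptr: *u8) void {
    const small: u8 = @truncate(wide);
    const aligned: *align(4) u8 = @alignCast(ptr);
    _ = small;
    _ = aligned;
}
```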
|
|
|
|
* std.crypto: faster ghash and polyval on WebAssembly
Before: 91 MiB/s
After : 243 MiB/s
Some other platforms might benefit from this, but WebAssembly is
the obvious one (simd128 doesn't make a difference).
|
|
|
|
|
|
|
|
|
|
POLYVAL is GHASH's little brother, required by the AES-GCM-SIV
construction. It is defined in RFC 8452.
The irreducible polynomial is a mirror of GHASH's, which changes nothing
for our implementation, since it never reversed the raw bits to start
with.
But most importantly, POLYVAL encodes byte strings as little-endian
instead of big-endian, which makes it a little bit faster on the vast
majority of modern CPUs.
So both share the same code, with comptime magic to pick the correct
endianness and to double the key only for GHASH.
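A minimal sketch (not the actual std.crypto code) of how a single generic implementation can back both, with the endianness chosen at comptime:

```zig
const std = @import("std");

fn GenericGhash(comptime block_endian: std.builtin.Endian) type {
    return struct {
        // Only the block encoding differs between the two; GHASH
        // additionally doubles the key, which is omitted from this sketch.
        fn readBlock(bytes: *const [16]u8) u128 {
            return std.mem.readInt(u128, bytes, block_endian);
        }
    };
}

const Ghash = GenericGhash(.big); // GHASH: big-endian block encoding
const Polyval = GenericGhash(.little); // POLYVAL: little-endian block encoding
```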
|