| author | Henry John Kupty <hkupty@users.noreply.github.com> | 2025-10-07 18:32:13 +0200 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2025-10-07 09:32:13 -0700 |
| commit | 163ebe044b76ada70b2bee2e17b9f3e948d54754 (patch) | |
| tree | 2cf3358476275fab5283e835452565e61dd14ce6 /lib/std/Io | |
| parent | 9760068826e01e5540da9168d2f02e15957a99cc (diff) | |
std.mem.countScalar: rework to benefit from simd (#25477)
`findScalarPos` might do repetitive work, even when using SIMD. For
example, when searching the string `/abcde/fghijk/lm` for the character
`/`, a 16-byte wide comparison would yield the mask `1000001000000100`,
but only the first `1` would be counted and the remainder of the string
would be searched again.
When testing locally, the difference was quite significant:
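To illustrate the idea of counting every set bit of the comparison mask in one step, here is a minimal sketch. It assumes `u8` elements and a fixed 16-lane block; `countScalarSketch` and `lanes` are hypothetical names chosen for this example, and this is not necessarily the code merged in this commit.

```zig
const std = @import("std");

/// Hypothetical sketch: counts occurrences of `needle` in `haystack`.
/// Assumes u8 elements and a fixed 16-lane block width.
fn countScalarSketch(haystack: []const u8, needle: u8) usize {
    const lanes = 16; // assumed block width, not taken from the commit
    const Block = @Vector(lanes, u8);
    const needles: Block = @splat(needle);

    var count: usize = 0;
    var i: usize = 0;
    // Compare one full block at a time and count all matches at once.
    while (i + lanes <= haystack.len) : (i += lanes) {
        const block: Block = haystack[i..][0..lanes].*;
        const mask: u16 = @bitCast(block == needles);
        count += @popCount(mask);
    }
    // Plain loop for the tail that does not fill a whole block.
    for (haystack[i..]) |item| {
        count += @intFromBool(item == needle);
    }
    return count;
}
```

With the example string above, a single block comparison finds all three `/` characters, so nothing is searched again.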
```
count scalar
5737 iterations 522.83us per iterations
0 bytes per iteration
worst: 2370us median: 512us stddev: 107.64us
count v2
38333 iterations 78.03us per iterations
0 bytes per iteration
worst: 713us median: 76us stddev: 10.62us
count scalar v2
99565 iterations 29.80us per iterations
0 bytes per iteration
worst: 41us median: 29us stddev: 1.04us
```
Note that `count v2` is a simpler loop-based count, similar to the loop
the SIMD approach uses for the remainder of the haystack:
```
pub fn countV2(comptime T: type, haystack: []const T, needle: T) usize {
    const n = haystack.len;
    if (n < 1) return 0;
    var count: usize = 0;
    for (haystack[0..n]) |item| {
        count += @intFromBool(item == needle);
    }
    return count;
}
```
This implies that the compiler generates optimized code for the simpler
loop, which outperforms the `findScalarPos`-based approach, hence the
use of the iterative approach for the remainder of the haystack.
Co-authored-by: StAlKeR7779 <stalkek7779@yandex.ru>
Diffstat (limited to 'lib/std/Io')
0 files changed, 0 insertions, 0 deletions
