path: root/lib/fuzzer.zig
Age | Commit message | Author
2025-09-25 | fuzzing: fix off-by-one in limit count | Andrew Kelley
2025-09-25 | implement review suggestions | Loris Cro
2025-09-24 | fuzzing: implement limited fuzzing | Loris Cro
Adds the limit option to `--fuzz=[limit]`. The limit expresses the maximum number of iterations that *each fuzz test* will perform before exiting. The limit argument also supports 'K', 'M', and 'G' suffixes (e.g. '10K'). Does not imply `--webui` (like unlimited fuzzing does) and prints a fuzzing report at the end. Closes #22900 but does not implement the time-based limit, as after internal discussions we concluded that it would be problematic to both implement and use correctly.
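As an illustration of the suffix handling described above, here is a minimal Python sketch. It is not the build runner's parser: `parse_fuzz_limit` and `run_limited` are hypothetical names, and the decimal multipliers (K = 1,000) are an assumption, since the commit message does not say whether the suffixes are decimal or binary.

```python
# Hypothetical helper, not the Zig build runner's actual parser.
# Assumption: suffixes are decimal multipliers (K = 1,000, M = 1,000,000, ...).
SUFFIXES = {"K": 1_000, "M": 1_000_000, "G": 1_000_000_000}


def parse_fuzz_limit(text: str) -> int:
    """Parse a `--fuzz=[limit]` style value such as '500' or '10K'."""
    if not text:
        raise ValueError("empty limit")
    multiplier = 1
    if text[-1] in SUFFIXES:
        multiplier = SUFFIXES[text[-1]]
        text = text[:-1]
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"invalid limit: {text!r}")
    return int(text) * multiplier


def run_limited(test_one, limit: int) -> int:
    """Call test_one at most `limit` times and return the iteration count."""
    iterations = 0
    while iterations < limit:  # strict '<' so exactly `limit` calls are made
        test_one(iterations)
        iterations += 1
    return iterations
```

The limit applies per fuzz test, so under this sketch each test would get its own `run_limited` call with the same parsed value.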
2025-09-18 | fuzzer: remove rodata load tracing | Kendall Condon
This can be re-evaluated at a later time, but at the moment the performance and stability concerns hold it back. Additionally, it promotes a non-smithing approach to fuzz tests.
2025-09-18 | greatly improve capabilities of the fuzzer | Kendall Condon
This PR significantly improves the capabilities of the fuzzer. The changes made to the fuzzer to accomplish this feat mostly include tracking memory reads from .rodata to determine fresh inputs, new mutations (especially the ones that insert const values from .rodata reads and __sanitizer_cov_trace_const_cmp), and minimizing found inputs. Additionally, the runs per second have been greatly increased due to generating smaller inputs and avoiding clearing the 8-bit pc counters. An additional feature added is that the length of the input file is now stored and the old input file is rerun upon start. Other changes made to the fuzzer include more logical initialization, using one shared file `in` for inputs, creating corpus files with proper sizes, and using hexadecimal-numbered corpus files for simplicity. Furthermore, I added several new fuzz tests to gauge the fuzzer's efficiency. I also tried to add a test for zstandard decompression, which it crashed within 60,000 runs (less than a second).
Bug fixes include:
* Fixed a race condition when multiple fuzzer processes needed to use the same coverage file.
* Web interface stats now update even when unique runs is not changing.
* Fixed tokenizer.testPropertiesUpheld to allow stray carriage returns since they are valid whitespace.
2025-08-30 | rework std.Io.Writer.Allocating to support runtime-known alignment | Andrew Kelley
Also, breaking API changes to:
* std.fs.Dir.readFileAlloc
* std.fs.Dir.readFileAllocOptions
2025-08-01 | build system: replace fuzzing UI with build UI, add time report | mlugg
This commit replaces the "fuzzer" UI, previously accessed with the `--fuzz` and `--port` flags, with a more interesting web UI which allows more interactions with the Zig build system. Most notably, it allows accessing the data emitted by a new "time report" system, which allows users to see which parts of Zig programs take the longest to compile. The option to expose the web UI is `--webui`. By default, it will listen on `[::1]` on a random port, but any IPv6 or IPv4 address can be specified with e.g. `--webui=[::1]:8000` or `--webui=127.0.0.1:8000`. The options `--fuzz` and `--time-report` both imply `--webui` if not given. Currently, `--webui` is incompatible with `--watch`; specifying both will cause `zig build` to exit with a fatal error. When the web UI is enabled, the build runner spawns the web server as soon as the configure phase completes. The frontend code consists of one HTML file, one JavaScript file, two CSS files, and a few Zig source files which are built into a WASM blob on-demand -- this is all very similar to the old fuzzer UI. Also inherited from the fuzzer UI is that the build system communicates with web clients over a WebSocket connection. When the build finishes, if `--webui` was passed (i.e. if the web server is running), the build runner does not terminate; it continues running to serve web requests, allowing interactive control of the build system. In the web interface is an overall "status" indicating whether a build is currently running, and also a list of all steps in this build. There are visual indicators (colors and spinners) for in-progress, succeeded, and failed steps. There is a "Rebuild" button which will cause the build system to reset the state of every step (note that this does not affect caching) and evaluate the step graph again. If `--time-report` is passed to `zig build`, a new section of the interface becomes visible, which associates every build step with a "time report". 
For most steps, this is just a simple "time taken" value. However, for `Compile` steps, the compiler communicates with the build system to provide it with much more interesting information: time taken for various pipeline phases, with a per-declaration and per-file breakdown, sorted by slowest declarations/files first. This feature is still in its early stages: the data can be a little tricky to understand, and there is no way to, for instance, sort by different properties, or filter to certain files. However, it has already given us some interesting statistics, and can be useful for spotting, for instance, particularly complex and slow compile-time logic. Additionally, if a compilation uses LLVM, its time report includes the "LLVM pass timing" information, which was previously accessible with the (now removed) `-ftime-report` compiler flag. To make time reports more useful, ZIR and compilation caches are ignored by the Zig compiler when they are enabled -- in other words, `Compile` steps *always* run, even if their result should be cached. This means that the flag can be used to analyze a project's compile time without having to repeatedly clear cache directory, for instance. However, when using `-fincremental`, updates other than the first will only show you the statistics for what changed on that particular update. Notably, this gives us a fairly nice way to see exactly which declarations were re-analyzed by an incremental update. If `--fuzz` is passed to `zig build`, another section of the web interface becomes visible, this time exposing the fuzzer. This is quite similar to the fuzzer UI this commit replaces, with only a few cosmetic tweaks. The interface is closer than before to supporting multiple fuzz steps at a time (in line with the overall strategy for this build UI, the goal will be for all of the fuzz steps to be accessible in the same interface), but still doesn't actually support it. 
The fuzzer UI looks quite different under the hood: as a result, various bugs are fixed, although other bugs remain. For instance, viewing the source code of any file other than the root of the main module is completely broken (as on master) due to some bogus file-to-module assignment logic in the fuzzer UI.
Implementation notes:
* The `lib/build-web/` directory holds the client side of the web UI.
* The general server logic is in `std.Build.WebServer`.
* Fuzzing-specific logic is in `std.Build.Fuzz`.
* `std.Build.abi` is the new home of `std.Build.Fuzz.abi`, since it now relates to the build system web UI in general.
* The build runner now has an **actual** general-purpose allocator, because thanks to `--watch` and `--webui`, the process can be arbitrarily long-lived. The gpa is `std.heap.DebugAllocator`, but the arena remains backed by `std.heap.page_allocator` for efficiency. I fixed several crashes caused by conflation of `gpa` and `arena` in the build runner and `std.Build`, but there may still be some I have missed.
* The I/O logic in `std.Build.WebServer` is pretty gnarly; there are a *lot* of threads involved. I anticipate this situation improving significantly once the `std.Io` interface (with concurrency support) is introduced.
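The two `--webui` address forms quoted in this commit message, `[::1]:8000` and `127.0.0.1:8000`, can be told apart with a few lines of parsing. The Python sketch below only illustrates that syntax; `parse_listen_address` is a hypothetical name, and requiring an explicit port is an assumption made for the sketch rather than documented behavior of `zig build`.

```python
import ipaddress


def parse_listen_address(value: str):
    """Split '[::1]:8000' or '127.0.0.1:8000' into (ip_address, port).

    Assumption: a port is always present; the real flag may be more lenient.
    """
    if value.startswith("["):
        # Bracketed IPv6 form: the address ends at the "]:" separator.
        host, sep, port = value[1:].partition("]:")
    else:
        # IPv4 form: the port follows the last colon.
        host, sep, port = value.rpartition(":")
    if not sep:
        raise ValueError(f"missing port in {value!r}")
    ip = ipaddress.ip_address(host)  # raises ValueError on a malformed address
    port_number = int(port)
    if not 0 <= port_number <= 65535:
        raise ValueError(f"port out of range: {port_number}")
    return ip, port_number
```

The brackets are needed in the IPv6 form because the address itself contains colons, so the last colon alone cannot mark where the port starts.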
2025-07-07 | update standalone and incremental tests to new API | Andrew Kelley
2025-04-30 | use correct symbol for the end of pcguard section | Dongjia Zhang
2025-04-26 | fuzz: fix expected section start/end symbol name on MacOS when linking libfuzzer | tjog
Not only is the section name different when adding the sancov variables; the linker symbol that ends up in the binary is also different. Reference: https://github.com/llvm/llvm-project/blob/60105ac6bab130c2694fc7f5b7b6a5fddaaab752/llvm/lib/Transforms/Instrumentation/SanitizerCoverage.cpp#L1076-L1104
2025-03-05 | Remove uses of deprecated callconv aliases | Linus Groh
2025-02-21 | fix `-fsanitize-coverage-trace-pc-guard` and fuzzer support for C compile units | Xavier Bouchoux
- allow `-fsanitize-coverage-trace-pc-guard` to be used on its own without enabling the fuzzer. (note that previously, while the flag was only active when fuzzing, the fuzzer itself doesn't use it, and the code will not link as is.)
- add stub functions in the fuzzer to link with instrumented C code (previously fuzzed tests failed to link if they were calling into C): while the zig compile unit uses a custom `EmitOptions.Coverage` with features disabled, the C code is built calling into the clang driver with "-fsanitize=fuzzer-no-link" that automatically enables the default features. (see https://github.com/llvm/llvm-project/blob/de06978ebcff5f75913067b019d2d522d0be0872/clang/lib/Driver/SanitizerArgs.cpp#L587)
- emit `-fsanitize-coverage=trace-pc-guard` instead of `-Xclang -fsanitize-coverage-trace-pc-guard` so that edge coverage is enabled by clang driver. (previously, it was enabled only because the fuzzer was)
2025-02-11 | fuzzer: write inputs to shared memory before running | Andrew Kelley
breaking change to the fuzz testing API; it now passes a type-safe context parameter to the fuzz function. libfuzzer is reworked to select inputs from the entire corpus. I tested that it's roughly as good as it was before in that it can find the panics in the simple examples, as well as achieve decent coverage on the tokenizer fuzz test. however I think the next step here will be figuring out why so many points of interest are missing from the tokenizer in both Debug and ReleaseSafe modes. does not quite close #20803 yet since there are some more important things to be done, such as opening the previous corpus, continuing fuzzing after finding bugs, storing the length of the inputs, etc.
2025-02-06 | adjust runtime page size APIs | Andrew Kelley
* fix merge conflicts
* rename the declarations
* reword documentation
* extract FixedBufferAllocator to separate file
* take advantage of locals
* remove the assertion about max alignment in Allocator API, leaving it Allocator implementation defined
* fix non-inline function call in start logic
The GeneralPurposeAllocator implementation is totally broken because it uses global state but I didn't address that in this commit.
2025-02-06 | runtime page size detection | Archbirdplus
heap.zig: define new default page sizes
heap.zig: add min/max_page_size and their options
lib/std/c: add miscellaneous declarations
heap.zig: add pageSize() and its options
switch to new page sizes, especially in GPA/stdlib
mem.zig: remove page_size
2024-11-05 | fix type of std_options | Jonathan Hallstrom
2024-09-12 | Replace deprecated default initializations with decl literals | Linus Groh
2024-09-11 | make lowest stack an internal libfuzzer detail | Andrew Kelley
This value is useful to help determine run uniqueness in the face of recursion, however it is not valuable to expose to the fuzzing UI.
2024-09-11 | libfuzzer: use a function pointer instead of extern | Andrew Kelley
solves the problem presented in the previous commit message
2024-09-11 | rework fuzzing API | Andrew Kelley
The previous API used `std.testing.fuzzInput(.{})` however that has the problem that users call it multiple times incorrectly, and there might be work happening to obtain the corpus which should not be included in coverage analysis, and which must not slow down iteration speed. This commit restructures it so that the main loop lives in libfuzzer and directly calls the "test one" function. In this commit I was a little too aggressive because I made the test runner export `fuzzer_one` for this purpose. This was motivated by performance, but it causes "exported symbol collision: fuzzer_one" to occur when more than one fuzz test is provided. There are three ways to solve this:
1. libfuzzer needs to be passed a function pointer instead. Possible performance downside.
2. build runner needs to build a different process per fuzz test. Potentially wasteful and unclear how to isolate them.
3. test runner needs to perform a relocation at runtime to point the function call to the relevant unit test. Portability issues and dubious performance gains.
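Option 1, passing the fuzzer a function pointer, is the shape the next commit adopts. The Python sketch below illustrates the idea only: the main loop owns input generation and receives the "test one" function as a value, so several fuzz tests can coexist without sharing one exported symbol. All names (`fuzz_loop`, `check_nonempty`, `check_always_fails`) are hypothetical, and the random byte generation stands in for the real corpus-driven mutation.

```python
import random
from typing import Callable, List

# A "test one" function takes a single input and raises on failure.
TestOne = Callable[[bytes], None]


def fuzz_loop(test_one: TestOne, iterations: int, seed: int = 0) -> List[bytes]:
    """Hypothetical main loop: generates inputs and calls test_one directly."""
    rng = random.Random(seed)
    failures: List[bytes] = []
    for _ in range(iterations):
        length = rng.randrange(1, 9)  # 1 to 8 bytes; placeholder for real mutation
        data = bytes(rng.randrange(256) for _ in range(length))
        try:
            test_one(data)
        except AssertionError:
            failures.append(data)  # keep the failing input for later inspection
    return failures


def check_nonempty(data: bytes) -> None:
    assert len(data) > 0


def check_always_fails(data: bytes) -> None:
    assert False, "always fails"
```

Because each call receives its own function value, `fuzz_loop(check_nonempty, 10)` and `fuzz_loop(check_always_fails, 10)` can run in the same program, which is the property the fixed `fuzzer_one` export lacked.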
2024-08-28 | implement code coverage instrumentation manually | Andrew Kelley
instead of relying on the LLVM sancov pass. The LLVM pass is still executed if trace_pc_guard is requested, disabled otherwise. The LLVM backend emits the instrumentation directly. It uses `__sancov_pcs1` symbol name instead of `__sancov_pcs` because each element is 1 usize instead of 2. AIR: add CoveragePoint to branch hints which indicates whether those branches are interesting for code coverage purposes. Update libfuzzer to use the new instrumentation. It's simplified since we no longer need the constructor and the pcs are now in a contiguous list. This is a regression in the fuzzing functionality because the instrumentation for comparisons is no longer emitted, resulting in worse fuzzer inputs generated. A future commit will add that instrumentation back.
2024-08-28 | std: update `std.builtin.Type` fields to follow naming conventions | mlugg
The compiler actually doesn't need any functional changes for this: Sema does reification based on the tag indices of `std.builtin.Type` already! So, no zig1.wasm update is necessary. This change is necessary to disallow name clashes between fields and decls on a type, which is a prerequisite of #9938.
2024-08-08 | more optimized and correct management of 8-bit PC counters | Andrew Kelley
* Upgrade from u8 to usize element types.
  - WebAssembly assumes u64. It should probably try to be target-aware instead.
* Move the covered PC bits to after the header so it goes on the same page with the other rapidly changing memory (the header stats).
depends on the semantics of accepted proposal #19755
closes #20994
2024-08-08 | fuzzing: comptime assertions to protect the ABI | Andrew Kelley
compile errors are nice
2024-08-07 | libfuzzer: fix looking at wrong memory for pc counters | Andrew Kelley
this fix bypasses the slice bounds, reading garbage data for up to the last 7 bits (which are technically supposed to be ignored). that's going to need to be fixed, let's fix that along with switching from byte elems to usize elems.
2024-08-07 | fuzzer web UI: receive coverage information | Andrew Kelley
* libfuzzer: track unique runs instead of deduplicated runs
  - easier for consumers to notice when to recheck the covered bits.
* move common definitions to `std.Build.Fuzz.abi`.
build runner sends all the information the fuzzer web interface client needs in order to display inline coverage information along with source code.
2024-08-07 | fuzzing: progress towards web UI | Andrew Kelley
* libfuzzer: close file after mmap
* fuzzer/main.js: connect with EventSource and debug dump the messages. currently this prints how many fuzzer runs have been attempted to console.log.
* extract some `std.debug.Info` logic into `std.debug.Coverage`. Prepares for consolidation across multiple different executables which share source files, and makes it possible to send all the PC/SourceLocation mapping data with 4 memcpy'd arrays.
* std.Build.Fuzz:
  - spawn a thread to watch the message queue and signal event subscribers.
  - track coverage map data
  - respond to /events URL with EventSource messages on a timer
2024-08-07 | introduce a web interface for fuzzing | Andrew Kelley
* new .zig-cache subdirectory: 'v'
  - stores coverage information with filename of hash of PCs that want coverage. This hash is a hex encoding of the 64-bit coverage ID.
* build runner
  * fixed bug in file system inputs when a compile step has an overridden zig_lib_dir field set.
  * set some std lib options optimized for the build runner
    - no side channel mitigations
    - no Transport Layer Security
    - no crypto fork safety
  * add a --port CLI arg for choosing the port the fuzzing web interface listens on. it defaults to choosing a random open port.
  * introduce a web server, and serve a basic single page application
    - shares wasm code with autodocs
    - assets are created live on request, for convenient development experience. main.wasm is properly cached if nothing changes.
    - sources.tar comes from file system inputs (introduced with the `--watch` feature)
  * receives coverage ID from test runner and sends it on a thread-safe queue to the WebServer.
* test runner
  - takes a zig cache directory argument now, for where to put coverage information.
  - sends coverage ID to parent process
* fuzzer
  - puts its logs (in debug mode) in .zig-cache/tmp/libfuzzer.log
  - computes coverage_id and makes it available with `fuzzer_coverage_id` exported function.
  - the memory-mapped coverage file is now namespaced by the coverage id in hex encoding, in `.zig-cache/v`
* tokenizer
  - add a fuzz test to check that several properties are upheld
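To make the "namespaced by the coverage id in hex encoding" scheme concrete, here is a small Python sketch. The commit message does not say which hash produces the 64-bit ID or whether the hex name is zero-padded, so the use of BLAKE2b truncated to 8 bytes and the 16-digit padding are both assumptions chosen for illustration; `coverage_id` and `coverage_path` are hypothetical names.

```python
import hashlib
import struct
from pathlib import PurePosixPath
from typing import Iterable


def coverage_id(pcs: Iterable[int]) -> int:
    """Derive a 64-bit ID from the list of instrumented PCs.

    Assumption: BLAKE2b with an 8-byte digest stands in for whatever hash
    the real fuzzer uses.
    """
    hasher = hashlib.blake2b(digest_size=8)
    for pc in pcs:
        hasher.update(struct.pack("<Q", pc))  # each PC as a little-endian u64
    return int.from_bytes(hasher.digest(), "little")


def coverage_path(cache_dir: str, cov_id: int) -> PurePosixPath:
    """Build '<cache_dir>/v/<hex id>' (zero-padded to 16 digits by assumption)."""
    return PurePosixPath(cache_dir) / "v" / format(cov_id, "016x")
```

Keying the file name on a hash of the PCs means two processes running the same instrumented binary arrive at the same file, while a rebuilt binary with different PCs gets a fresh one.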
2024-08-07 | fuzzer: log errors and move deduplicated runs to shared mem | Andrew Kelley
2024-08-07 | fuzzer: track code coverage from all runs | Andrew Kelley
When a unique run is encountered, track it in a bit set memory-mapped into the fuzz directory so it can be observed by other processes, even while the fuzzer is running.
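A rough Python sketch of the idea: per-run PC counters are folded into a persistent bit set that lives in a memory-mapped file, so another process mapping the same file can see which PCs have been covered. The layout here (one bit per PC, no header) is an assumption for illustration, and `open_seen_bits` and `record_run` are hypothetical names, not the fuzzer's API.

```python
import mmap
import os


def open_seen_bits(path: str, pc_count: int) -> mmap.mmap:
    """Create or open a file holding one bit per PC and map it read/write."""
    size = (pc_count + 7) // 8  # round up to whole bytes
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        os.ftruncate(fd, size)
        return mmap.mmap(fd, size)
    finally:
        os.close(fd)  # the mapping stays valid after the descriptor is closed


def record_run(seen, counters: bytes) -> bool:
    """OR the PCs hit in this run into `seen`; return True if any bit is new."""
    fresh = False
    for pc_index, count in enumerate(counters):
        if count == 0:
            continue  # this PC was not executed during the run
        byte_index, bit = divmod(pc_index, 8)
        mask = 1 << bit
        if not seen[byte_index] & mask:
            seen[byte_index] |= mask
            fresh = True
    return fresh
```

A run that sets at least one new bit covered something no earlier run had, which is one simple way to read "unique run" in this context.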
2024-07-25 | fuzzer: use the cmp values | Andrew Kelley
seems to provide better scoring
2024-07-25 | fuzzer: basic implementation | Andrew Kelley
just some experimentation. I didn't expect this to be effective so quickly but it already can find a comparison made with mem.eql
2024-07-25 | add --debug-rt CLI arg to the compiler + bonus edits | Andrew Kelley
The flag makes compiler_rt and libfuzzer be in debug mode. Also:
* fuzzer: override debug logs and disable debug logs for frequently called functions
* std.Build.Fuzz: fix bug of rerunning the old unit test binary
* report errors from rebuilding the unit tests better
* link.Elf: additionally add tsan lib and fuzzer lib to the hash
2024-07-25 | implement std.testing.fuzzInput | Andrew Kelley
For now this returns a dummy fuzz input.
2024-07-22 | libfuzzer: log all the libcalls to stderr | Andrew Kelley
2024-07-22 | libfuzzer: implement enough symbols for hello world | Andrew Kelley
2024-07-22 | initial support for integrated fuzzing | Andrew Kelley
* Add the `-ffuzz` and `-fno-fuzz` CLI arguments.
* Detect fuzz testing flags from zig cc.
* Set the correct clang flags when fuzz testing is requested. It can be combined with TSAN and UBSAN.
* Compilation: build fuzzer library when needed which is currently an empty zig file.
* Add optforfuzzing to every function in the llvm backend for modules that have requested fuzzing.
* In ZigLLVMTargetMachineEmitToFile, add the optimization passes for sanitizer coverage.
* std.mem.eql uses a naive implementation optimized for fuzzing when builtin.fuzz is true.
Tracked by #20702