| Age | Commit message (Collapse) | Author |
|
Structs were previously using `SegmentedList` to be given indexes, but
were not actually backed by the InternPool arrays.
After this, the only remaining uses of `SegmentedList` in the compiler
are `Module.Decl` and `Module.Namespace`. Once those last two are
migrated to become backed by InternPool arrays as well, we can introduce
state serialization via writing these arrays to disk all at once.
Unfortunately there are a lot of source code locations that touch the
struct type API, so this commit is still work-in-progress. Once I get it
compiling and passing the test suite, I can provide some interesting
data points such as how it affected the InternPool memory size and
performance comparison against master branch.
I also couldn't resist migrating over a bunch of alignment API over to
use the log2 Alignment type rather than a mismash of u32 and u64 byte
units with 0 meaning something implicitly different and special at every
location. Turns out you can do all the math you need directly on the
log2 representation of alignments.
|
|
|
|
We can infer the framework name from the included resolved framework
path.
Fix hash implementation, and bump linker hash value from 9 to 10.
|
|
|
|
closes #16347
|
|
wasm-linker: finish shared-memory & TLS implementation
|
|
Previously, they were only created when we had any TLS segment.
This meant that while the symbol existed, the global itself wouldn't.
The result of this was a crash during symbol names writing as it
would attempt to write the symbol name of a global that didn't exist.
Now we always create them, and instead update its `init` value during
`setupMemory`.
In the future, the entire symbol (and global) will be removed by
the garbage collector.
|
|
Implements the `start` section which will execute a given function
at startup of the program. After function execution, the _start
function will be called by the runtime. In the case of shared-memory
we set this section to the function `__wasm_init_memory` which will
initialize all memory on startup.
This also fixes the above mentioned function to ensure we correctly
lower the i32 values.
Lastly, this fixes a typo where we would retrieve a global, instead
of setting its value.
|
|
|
|
Rather than verifying if importing memory is false, we now rely
on the option that was passed to the CLI (where export is defaulted
to `true` unless only import-memory is given).
|
|
Abridged summary:
* Move `Module.Fn` into `InternPool`.
* Delete a lot of confusing and problematic `Sema` logic related to
generic function calls.
This commit removes `Module.Fn` and replaces it with two new
`InternPool.Tag` values:
* `func_decl` - corresponding to a function declared in the source
code. This one contains line/column numbers, zir_body_inst, etc.
* `func_instance` - one for each monomorphization of a generic
function. Contains a reference to the `func_decl` from whence the
instantiation came, along with the `comptime` parameter values (or
types in the case of `anytype`)
Since `InternPool` provides deduplication on these values, these fields
are now deleted from `Module`:
* `monomorphed_func_keys`
* `monomorphed_funcs`
* `align_stack_fns`
Instead of these, Sema logic for generic function instantiation now
unconditionally evaluates the function prototype expression for every
generic callsite. This is technically required in order for type
coercions to work. The previous code had some dubious, probably wrong
hacks to make things work, such as `hashUncoerced`. I'm not 100% sure
how we were able to eliminate that function and still pass all the
behavior tests, but I'm pretty sure things were still broken without
doing type coercion for every generic function call argument.
After the function prototype is evaluated, it produces a deduplicated
`func_instance` `InternPool.Index` which can then be used for the
generic function call.
Some other nice things made by this simplification are the removal of
`comptime_args_fn_inst` and `preallocated_new_func` from `Sema`, and the
messy logic associated with them.
I have not yet been able to measure the perf of this against master
branch. On one hand, it reduces memory usage and pointer chasing of the
most heavily used `InternPool` Tag - function bodies - but on the other
hand, it does evaluate function prototype expressions more than before.
We will soon find out.
|
|
|
|
This flag allows the user to force export the memory to the host
environment. This is useful when the memory is imported from the
host but must also be exported. This is (currently) required
to pass the memory validation for runtimes when using threads.
In this future this may become an error instead.
|
|
Most of this migration was performed automatically with `zig fmt`. There
were a few exceptions which I had to manually fix:
* `@alignCast` and `@addrSpaceCast` cannot be automatically rewritten
* `@truncate`'s fixup is incorrect for vectors
* Test cases are not formatted, and their error locations change
|
|
|
|
Signed-off-by: Eric Joldasov <bratishkaerik@getgoogleoff.me>
|
|
Anecdote 1: The generic version is way more popular than the non-generic
one in Zig codebase:
git grep -w alignForward | wc -l
56
git grep -w alignForwardGeneric | wc -l
149
git grep -w alignBackward | wc -l
6
git grep -w alignBackwardGeneric | wc -l
15
Anecdote 2: In my project (turbonss) that does much arithmetic and
alignment I exclusively use the Generic functions.
Anecdote 3: we used only the Generic versions in the Macho Man's linker
workshop.
|
|
wasm/linker: symbol resolution improvements
|
|
|
|
We now resolve undefined symbols during incremental-compilation
where we discard the current symbol if we detect we found
an existing symbol which is not the one currently being updated.
The symbol will always be discarded in favor of the existing symbol
in such a case.
|
|
When compiling Zig code using the Wasm backend, we would previously
incorrectly resolve exported symbols as it would not correctly remove
existing symbols if they were to be overwritten. This meant that
undefined symbols could cause collisions although they should be
resolved by the exported symbol.
|
|
These are frequently invalidated whenever a string is interned, so avoid
creating pointers to `string_bytes` wherever possible. This is an
attempt to fix random CI failures.
|
|
This avoids having dangling pointers into `InternPool.string_bytes`.
|
|
The main motivation for this commit is eliminating Decl.value_arena.
Everything else is dominoes.
Decl.name used to be stored in the GPA, now it is stored in InternPool.
It ended up being simpler to migrate other strings to be interned as
well, such as struct field names, union field names, and a few others.
This ended up requiring a big diff, sorry about that. But the changes
are pretty nice, we finally start to take advantage of InternPool's
existence.
global_error_set and error_name_list are simplified. Now it is a single
ArrayHashMap(NullTerminatedString, void) and the index is the error tag
value.
Module.tmp_hack_arena is re-introduced (it was removed in
eeff407941560ce8eb5b737b2436dfa93cfd3a0c) in order to deal with
comptime_args, optimized_order, and struct and union fields. After
structs and unions get moved into InternPool properly, tmp_hack_arena
can be deleted again.
|
|
This makes the difference between `decl.getOwnedFunction` and
`decl.val.getFunction` more clear when reading the code.
|
|
|
|
I'm seeing a new assertion trip: the call to `enumTagFieldIndex` in the
implementation of `@Type` is attempting to query the field index of an
union's enum tag, but the type of the enum tag value provided is not the
same as the union's tag type. Most likely this is a problem with type
coercion, since values are now typed.
Another problem is that I added some hacks in std.builtin because I
didn't see any convenient way to access them from Sema. That should
definitely be cleaned up before merging this branch.
|
|
|
|
Notably, `vector`.
Additionally, all alternate encodings of `pointer`, `optional`, and
`array`.
|
|
Instead of doing everything at once which is a hopelessly large task,
this introduces a piecemeal transition that can be done in small
increments at a time.
This is a minimal changeset that keeps the compiler compiling. It only
uses the InternPool for a small set of types.
Behavior tests are not passing.
Air.Inst.Ref and Zir.Inst.Ref are separated into different enums but
compile-time verified to have the same fields in the same order.
The large set of changes is mainly to deal with the fact that most Type
and Value methods now require a Module to be passed in, so that the
InternPool object can be accessed.
|
|
`-l :path/to/lib.so` behavior on gcc/clang is:
- the path is recorded as-is: no paths, exact filename (`libX.so.Y`).
- no rpaths.
The previous version removed the `:` and pretended it's a positional
argument to the linker. That works in almost all cases, except in how
rules_go[1] does things (the Bazel wrapper for Go).
Test case in #15743, output:
gcc rpath:
0x0000000000000001 (NEEDED) Shared library: [libversioned.so.2]
0x000000000000001d (RUNPATH) Library runpath: [$ORIGIN/x]
gcc plain:
0x0000000000000001 (NEEDED) Shared library: [libversioned.so.2]
zig cc rpath:
0x0000000000000001 (NEEDED) Shared library: [libversioned.so.2]
0x000000000000001d (RUNPATH) Library runpath: [$ORIGIN/x]
zig cc plain:
0x0000000000000001 (NEEDED) Shared library: [libversioned.so.2]
Fixes #15743
[1]: https://github.com/bazelbuild/rules_go
|
|
|
|
* build.zig: the result of b.option() can be assigned directly in many
cases thanks to the return type being an optional
* std.Build: make the build system aware of the
std.Build.Step.Compile.BuildId type when used as an option.
- remove extraneous newlines in error logs
* simplify caching logic
* simplify hexstring parsing tests and use a doc test
* simplify hashing logic. don't use an optional when the `none` tag
already provides this meaning.
* CLI: fix incorrect linker arg parsing
|
|
|
|
|
|
|
|
Rather than using a function call to verify if an error fits within
the global error set's length, we now store the error set' size in
the .rodata segment of the linear memory and load that value onto
the stack to check with the integer value.
|
|
Creates a global undefined symbol when this instruction is called.
The linker will then resolve it as a lazy symbol, ensuring it is
only generated when the symbol was created. In `flush` it will then
generate the function as only then, all errors are known and we can
generate the function body. This logic allows us to re-use the same
functionality of linker-synthetic-functions.
|
|
|
|
When we lower the instruction for `@tagName` we generate a new
function if it doesn't exist yet for that decl. This function creates
an if-else chain to determine which value was provided. Each tag
generates a constant that contains its name as the value. For each
tag we generate a case where the pointer of the string is stored in
the result slice. The length of the tagname is comptime-known, therefore
will be stored in the slice directly without having it being part of
the tagname symbol. In the future this can use a jump table instead
of an if-else chain, similar to the `switch` instruction.
|
|
Also clean up parsing of linker args - reuse `ArgsIterator`.
In MachO, ensure we add every symbol marked with `-u` as undefined
before proceeding with symbol resolution. Additionally, ensure those
symbols are never garbage collected.
MachO entry_in_dylib test: pass `-u _my_main` when linking executable
so that it is not incorrectly garbage collected by the linker.
|
|
|
|
wasm-linker: Implement shared-memory
|
|
Implements the __wasm_init_memory and __wasm_init_memory_flag synthetic
function and symbol.
The former will initialize all passive segments during runtime. For the
bss section we will fill it with zeroes, whereas the other segments
will simply be initialized only.
The latter stores the offset into the linear data section, after all
heap memory that is part of the Wasm module. Any memory initialized
at runtime starts from this offset.
|
|
This adds the atomic opcodes for the Threads proposal to the
WebAssembly specification: https://github.com/WebAssembly/threads
PrefixedOpcode has been renamed to MiscOpcode as there's multiple
types of prefixed opcodes. This naming is similar to other tools
such as LLVM. As we now use the 0xFE prefix, we moved the
function_index MIR instruction as it was occupying the same value.
This commit includes renaming all related opcodes.
|
|
|
|
Implements the TLS initialization function. This is a synthetic function
created by the linker. This will only be created when shared-memory is
enabled. This function will be called during thread creation, if there's
any TLS symbols, which will initialize the TLS segment using the
bulk-memory feature.
|
|
Initialize TLS symbols when shared-memory is enabled. Those symbols
will be called by synthetic functions created by the linker. (TODO).
|
|
When linking with shared-memory enabled, we must ensure to emit
the "data count" section as well as emit the correct segment flags
to tell the runtime/loader that each segment is passive. This is
required as we don't emit the offsets for such segments but instead
initialize each segment (for each thread) during runtime.
|
|
When the user enables shared-memory, we must ensure the linked objects
have the 'atomics' and 'bulk-memory' features allowed.
|