| Age | Commit message (Collapse) | Author |
|
The goals of this branch are to:
* compile faster when using the wasm linker and backend
* enable saving compiler state by directly copying in-memory linker
state to disk.
* more efficient compiler memory utilization
* introduce integer type safety to wasm linker code
* generate better WebAssembly code
* fully participate in incremental compilation
* do as much work as possible outside of flush(), while continuing to do
linker garbage collection.
* avoid unnecessary heap allocations
* avoid unnecessary indirect function calls
In order to accomplish this goals, this removes the ZigObject
abstraction, as well as Symbol and Atom. These abstractions resulted
in overly generic code, doing unnecessary work, and needless
complications that simply go away by creating a better in-memory data
model and emitting more things lazily.
For example, this makes wasm codegen emit MIR which is then lowered to
wasm code during linking, with optimal function indexes etc, or
relocations are emitted if outputting an object. Previously, this would
always emit relocations, which are fully unnecessary when emitting an
executable, and required all function calls to use the maximum size LEB
encoding.
This branch introduces the concept of the "prelink" phase which occurs
after all object files have been parsed, but before any Zcu updates are
sent to the linker. This allows the linker to fully parse all objects
into a compact memory model, which is guaranteed to be complete when Zcu
code is generated.
This commit is not a complete implementation of all these goals; it is
not even passing semantic analysis.
|
|
Before, the wasm struct had a string table, the ZigObject had a string
table, and each Object had a string table.
Now there is just the one. This makes for more efficient use of memory
and simplifies logic, particularly with regards to linker state
serialization.
This commit additionally adds significantly more integer type safety.
|
|
Removes the `files` field from the Wasm linker, storing the ZigObject
as its own field instead using a tagged union.
This removes a layer of indirection when accessing the ZigObject, and
untangles logic so that we can introduce a "pre-link" phase that
prepares the linker state to handle only incremental updates to the
ZigObject and then minimize logic inside flush().
Furthermore, don't make array elements store their own indexes, that's
always a waste.
Flattens some of the file system hierarchy and unifies variable names
for easier refactoring.
Introduces type safety for optional object indexes.
|
|
|
|
This introduces some type safety so we cannot accidently give an atom
index as a symbol index. This also means we do not have to store any
optionals and therefore allow for memory optimizations. Lastly, we can
now always simply access the symbol index of an atom, rather than having
to call `getSymbolIndex` as it is easy to forget.
|
|
When multiple symbols point to the same function, we ensure any
other symbol other than the original will be discarded and point
to the original instead. This prevents emitting the same function
code more than once.
|
|
Symbols which are exported to the host, or contain the `NO_STRIP`
flag, will be marked. All symbols which are referenced by this symbol
are marked likewise. We achieve this by parsing all relocations of a
symbol, and then marking the symbol it points to within the relocation.
|
|
Previously the symbol tag field would remain `undefined` until it
was set during `flush`. However, the symbol's tag would be observed
earlier than where it was being set. We now set it to the explicit
tag `undefined` so this can be caught during debug. The symbol tag of
a decl will now also be set right after `updateDecl` and `updateFunc`.
Likewise, we now also set the `name` field during atom creation for
decls, as well as set the other fields to the max(u32) to ensure we
get a compiler crash during debug to ensure any misses will be caught.
|
|
Signed-off-by: Eric Joldasov <bratishkaerik@getgoogleoff.me>
|
|
Linker now parses segments with regards to TLS segments. If the name
represents a TLS segment but does not contain the TLS flag, we set it
manually as the object file is created using an older compiler (LLVM).
For now we panic when we find a TLS relocation and implement those
later.
|
|
For data symbols we will now store its virtual address. This means
we do no longer have to calculate it each time a relocation asks
for the address. This is now done for all data symbols only once
rather than every single relocation for that symbol.
This now also allows us directly store the virtual address of synthetic
symbols without having to create an atom for them. This means we also
don't need to have a "synthetic" segment any longer and do not emit
the synthetic symbols such as __heap_end and __heap_base into the final
binary.
|
|
Adds support for both the `-rdynamic` and the `--export=<value>`
flags. Support is added to both the incremental linker as well as
the traditional linker (zld).
|
|
When a data symbol is required to be exported, we instead generate
a global that will be exported. This global is immutable and contains
the address of the data symbol.
|
|
|
|
Generate symbols for extern variables and try to resolve them.
Unresolved 'data' symbols generate an error as they cannot be
exported from the Wasm runtime into a Wasm module. This means,
they can only be resolved by other object files such as from other
Zig or C code compiled to Wasm.
|
|
When a new symbol is resolved to an existing symbol where
it doesn't overwrite the existing symbol, we now add this symbol
to the discarded list. This is required so when any relocation points
to the symbol, we can retrieve the correct symbol it's resolved by instead.
|
|
For all symbols read from object files as well as generated from Zig code
will now be interned and have their offset into the string table saved on the `Symbol` instead.
Besides interning, local symbols now also use a decl's fully qualified name.
When a decl/symbol is extern/to-be-imported, the name of the decl itself will be used for symbol resolving.
Similarly for symbols that will be exported, will have their 'export name' set.
|
|
We now correctly implement exporting decls. This means it is possible to export
a decl with a different name than the decl that is doing the export.
This also sets the symbols with the correct flags, so when we emit a relocatable
object file, a linker can correctly resolve symbols and/or export the symbol to the host environment.
This commit also includes fixes to ensure relocations have the correct offset to how other
linkers will expect the offset, rather than what we use internally.
Other linkers accept the offset, relative to the section.
Internally we use an offset relative to the atom.
|
|
We now correctly allocate and create atoms for symbols from other object files.
Imports are now also resolved and appended when required.
Besides those changes, we now duplicate all symbol names, so we can correctly
generate unique names for unnamed constants.
TODO: String interning
|
|
This implements the merging of all sections, to generate a valid wasm binary where all symbols
have been resolved and their respective sections have been merged into the final binary.
|
|
This updates the test runner for stage2 to emit to stdout with the passed, skipped and failed tests
similar to the LLVM backend.
Another change to this is the start function, as it's now more in line with stage1's.
The stage2 test infrastructure for wasm/wasi has been updated to reflect this as well.
|
|
The linker will now emit names for all function, global and data segment symbols.
This increases the ability to debug wasm modules tremendously as tools like wasm2wat
can use this information to generate named functions, globals etc, rather than placeholders such as $f1.
|
|
- Converts previous `DeclBlock` into `Atom`'s to also make them compatible when
the rest of zlwd gets upstreamed and we can link with other object files.
- Resolves function signatures and removes any duplicates, saving us a lot of
potential bytes for larger projects.
- We now create symbols for each decl of the respective type
- We can now (but not implemented yet) perform proper relocations.
- Having symbols and segment_info allows us to create an object file
for wasm.
|