| Age | Commit message (Collapse) | Author |
|
|
|
|
|
Still, the branch is not yet passing semantic analysis.
|
|
|
|
|
|
mainly, rework how relocations works. This is the point at which symbol
indexes are known - not before. And don't emit unnecessary relocations!
They're only needed when emitting an object file.
Changes wasm linker to keep MIR around long-lived so that fixups can be
reapplied after linker garbage collection.
use labeled switch while we're at it
|
|
|
|
Makes linker functions have small error sets, required to report
diagnostics properly rather than having a massive error set that has a
lot of codes.
Other linker implementations are not ported yet.
Also the branch is not passing semantic analysis yet.
|
|
The goals of this branch are to:
* compile faster when using the wasm linker and backend
* enable saving compiler state by directly copying in-memory linker
state to disk.
* more efficient compiler memory utilization
* introduce integer type safety to wasm linker code
* generate better WebAssembly code
* fully participate in incremental compilation
* do as much work as possible outside of flush(), while continuing to do
linker garbage collection.
* avoid unnecessary heap allocations
* avoid unnecessary indirect function calls
In order to accomplish this goals, this removes the ZigObject
abstraction, as well as Symbol and Atom. These abstractions resulted
in overly generic code, doing unnecessary work, and needless
complications that simply go away by creating a better in-memory data
model and emitting more things lazily.
For example, this makes wasm codegen emit MIR which is then lowered to
wasm code during linking, with optimal function indexes etc, or
relocations are emitted if outputting an object. Previously, this would
always emit relocations, which are fully unnecessary when emitting an
executable, and required all function calls to use the maximum size LEB
encoding.
This branch introduces the concept of the "prelink" phase which occurs
after all object files have been parsed, but before any Zcu updates are
sent to the linker. This allows the linker to fully parse all objects
into a compact memory model, which is guaranteed to be complete when Zcu
code is generated.
This commit is not a complete implementation of all these goals; it is
not even passing semantic analysis.
|
|
|
|
|
|
to help understand where a spurious failure is occurring
|
|
Don't use the reader interface
Avoid unnecessary heap allocations
At first I started working on incorporating the Archive fields into the
Wasm data model, however, I realized a better strategy: simply omit
Archive data from the serialized linker state. Those files can be
trivially reparsed on next compiler process start. If they haven't
changed, great. Otherwise if they have, the prelink phase needs to be
restarted anyway.
|
|
Before, the wasm struct had a string table, the ZigObject had a string
table, and each Object had a string table.
Now there is just the one. This makes for more efficient use of memory
and simplifies logic, particularly with regards to linker state
serialization.
This commit additionally adds significantly more integer type safety.
|
|
Primarily, this moves linker input parsing from flush() into the linker
task queue, which is executed simultaneously with the frontend.
I also made it avoid redundantly opening the same archive file N times
for each object file inside. Furthermore, hard code fixed buffer stream
rather than using a generic stream type.
Finally, I fixed the error handling of the Wasm.Archive.parse function.
Please pay attention to this pattern of returning a struct rather than
accepting a mutable struct as an argument. This ensures function-level
atomicity and makes resource management straightforward.
Deletes the file and path fields from Archive and Object.
Removed a well-meaning but ultimately misguided suggestion about how to
think about ZigObject since thinking about it that way has led to
problematic anti-DOD patterns.
|
|
Removes the `files` field from the Wasm linker, storing the ZigObject
as its own field instead using a tagged union.
This removes a layer of indirection when accessing the ZigObject, and
untangles logic so that we can introduce a "pre-link" phase that
prepares the linker state to handle only incremental updates to the
ZigObject and then minimize logic inside flush().
Furthermore, don't make array elements store their own indexes, that's
always a waste.
Flattens some of the file system hierarchy and unifies variable names
for easier refactoring.
Introduces type safety for optional object indexes.
|
|
|
|
* Compilation.objects changes to Compilation.link_inputs which stores
objects, archives, windows resources, shared objects, and strings
intended to be put directly into the dynamic section. Order is now
preserved between all of these kinds of linker inputs. If it is
determined the order does not matter for a particular kind of linker
input, that item should be moved to a different array.
* rename system_libs to windows_libs
* untangle library lookup from CLI types
* when doing library lookup, instead of using access syscalls, go ahead
and open the files and keep the handles around for passing to the
cache system and the linker.
* during library lookup and cache file hashing, use positioned reads to
avoid affecting the file seek position.
* library directories are opened in the CLI and converted to Directory
objects, warnings emitted for those that cannot be opened.
|
|
By organizing linker diagnostics into this struct, it becomes possible
to share more code between linker backends, and more importantly it
becomes possible to pass only the Diag struct to some functions, rather
than passing the entire linker state object in. This makes data
dependencies more obvious, making it easier to rearrange code and to
multithread.
Also fix MachO code abusing an atomic variable. Not only was it using
the wrong atomic operation, it is unnecessary additional state since
the state is already being protected by a mutex.
|
|
Embrace the Path abstraction, doing more operations based on directory
handles rather than absolute file paths. Most of the diff noise here
comes from this one.
Fix sorting of crtbegin/crtend atoms. Previously it would look at all
path components for those strings.
Make the C runtime path detection partially a pure function, and move
some logic to glibc.zig where it belongs.
|
|
The goal is to minimize as much as possible how much logic is inside
flush(). So let's start by moving out obvious stuff. This data can be
preformatted before flush().
|
|
|
|
The compiler actually doesn't need any functional changes for this: Sema
does reification based on the tag indices of `std.builtin.Type` already!
So, no zig1.wasm update is necessary.
This change is necessary to disallow name clashes between fields and
decls on a type, which is a prerequisite of #9938.
|
|
|
|
This type is exactly the same as std.Build.Cache.Path, except for
one function which is not used anymore. Therefore we can replace
it without consequences.
|
|
The type `Zcu.Decl` in the compiler is problematic: over time it has
gained many responsibilities. Every source declaration, container type,
generic instantiation, and `@extern` has a `Decl`. The functions of
these `Decl`s are in some cases entirely disjoint.
After careful analysis, I determined that the two main responsibilities
of `Decl` are as follows:
* A `Decl` acts as the "subject" of semantic analysis at comptime. A
single unit of analysis is either a runtime function body, or a
`Decl`. It registers incremental dependencies, tracks analysis errors,
etc.
* A `Decl` acts as a "global variable": a pointer to it is consistent,
and it may be lowered to a specific symbol by the codegen backend.
This commit eliminates `Decl` and introduces new types to model these
responsibilities: `Cau` (Comptime Analysis Unit) and `Nav` (Named
Addressable Value).
Every source declaration, and every container type requiring resolution
(so *not* including `opaque`), has a `Cau`. For a source declaration,
this `Cau` performs the resolution of its value. (When #131 is
implemented, it is unsolved whether type and value resolution will share
a `Cau` or have two distinct `Cau`s.) For a type, this `Cau` is the
context in which type resolution occurs.
Every non-`comptime` source declaration, every generic instantiation,
and every distinct `extern` has a `Nav`. These are sent to codegen/link:
the backends by definition do not care about `Cau`s.
This commit has some minor technically-breaking changes surrounding
`usingnamespace`. I don't think they'll impact anyone, since the changes
are fixes around semantics which were previously inconsistent (the
behavior changed depending on hashmap iteration order!).
Aside from that, this changeset has no significant user-facing changes.
Instead, it is an internal refactor which makes it easier to correctly
model the responsibilities of different objects, particularly regarding
incremental compilation. The performance impact should be negligible,
but I will take measurements before merging this work into `master`.
Co-authored-by: Jacob Young <jacobly0@users.noreply.github.com>
Co-authored-by: Jakub Konka <kubkon@jakubkonka.com>
|
|
Enable it by default when building Zig code in release modes.
Contributes to #9432.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
This change modifies `Zcu.ErrorMsg` to store a `Zcu.LazySrcLoc` rather
than a `Zcu.SrcLoc`. Everything else is dominoes.
The reason for this change is incremental compilation. If a failed
`AnalUnit` is up-to-date on an update, we want to re-use the old error
messages. However, the file containing the error location may have been
modified, and `SrcLoc` cannot survive such a modification. `LazySrcLoc`
is designed to be correct across incremental updates. Therefore, we
defer source location resolution until `Compilation` gathers the compile
errors into the `ErrorBundle`.
|
|
This commit reworks our representation of exported Decls and values in
Zcu to be memory-optimized and trivially serialized.
All exports are now stored in the `all_exports` array on `Zcu`. An
`AnalUnit` which performs an export (either through an `export`
annotation or by containing an analyzed `@export`) gains an entry into
`single_exports` if it performs only one export, or `multi_exports` if
it performs multiple.
We no longer store a persistent mapping from a `Decl`/value to all
exports of that entity; this state is not necessary for the majority of
the pipeline. Instead, we construct it in `Zcu.processExports`, just
before flush. This does not affect the algorithmic complexity of
`processExports`, since this function already iterates all exports in
the `Zcu`.
The elimination of `decl_exports` and `value_exports` led to a few
non-trivial backend changes. The LLVM backend has been wrangled into a
more reasonable state in general regarding exports and externs. The C
backend is currently disabled in this commit, because its support for
`export` was quite broken, and that was exposed by this work -- I'm
hoping @jacobly0 will be able to pick this up!
|
|
|
|
This patch is a pure rename plus only changing the file path in
`@import` sites, so it is expected to not create version control
conflicts, even when rebasing.
|
|
|
|
|
|
|
|
Closes #14904
|
|
|
|
closes #5019
|
|
|
|
|
|
|
|
Certain symbols were left unmarked, meaning they would not be emit into
the final binary incorrectly. We now mark the synthetic symbols to ensure
they are emit as they are already created under the circumstance they're
needed for. This also re-enables disabled tests that were left disabled
in a previous merge conflict.
Lastly, this adds the shared-memory test to the test harnass as it was
previously forgotten and therefore regressed.
|
|
Rather than using the logger, we now emit proper 'compiler'-errors just
like the ELF and MachO linkers with notes. We now also support emitting
multiple errors before quiting the linking process in certain phases,
such as symbol resolution. This means we will print all symbols which
were resolved incorrectly, rather than the first one we encounter.
|
|
This introduces some type safety so we cannot accidently give an atom
index as a symbol index. This also means we do not have to store any
optionals and therefore allow for memory optimizations. Lastly, we can
now always simply access the symbol index of an atom, rather than having
to call `getSymbolIndex` as it is easy to forget.
|