| author | Tom Read Cutting <moosichu@users.noreply.github.com> | 2023-02-19 12:14:03 +0000 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2023-02-19 14:14:03 +0200 |
| commit | 346ec15c5005e523c2a1d4b967ee7a4e5d1e9775 (patch) | |
| tree | 16f8b1bc34b30421f40c7d3b2aae5a770fe732b4 /doc/langref.html.in | |
| parent | 281d4c0ff6f95de0090f3621b4bdb651cd3d0330 (diff) | |
Correctly handle carriage return characters according to the spec (#12661)
* Scan from line start when finding tag in tokenizer
This resolves a crash that can occur for bytes such as carriage returns,
which are invalid inside literals but valid characters elsewhere.
There are potentially other edge cases this could resolve as well, as
the calling code for this function didn't account for any potential
'pending_invalid_tokens' that could be queued up by the tokenizer from
within another state.
* Fix carriage return crash in multiline string
Follow the guidance of #38:
> However CR directly before NL is interpreted as only a newline and not part of the multiline string. zig fmt will delete the CR.
Zig fmt already had code for deleting carriage returns, but would still
crash - now it no longer does so. Carriage returns encountered before
line-feeds are now appropriately removed on program compilation as well.
* Only accept carriage returns before line feeds
The previous commit was much less strict about this; this one more closely
matches the desired spec of only allowing CR characters in a CRLF pair, but
not otherwise.
* Fix CR being rejected when used as whitespace
Missed this comment from ziglang/zig-spec#83:
> CR used as whitespace, whether directly preceding NL or stray, is still unambiguously whitespace. It is accepted by the grammar and replaced by the canonical whitespace by zig fmt.
* Add tests for carriage return handling
Diffstat (limited to 'doc/langref.html.in')
| -rw-r--r-- | doc/langref.html.in | 3 |
1 files changed, 2 insertions, 1 deletions
diff --git a/doc/langref.html.in b/doc/langref.html.in
index a74d06ccbf..8ef31558d5 100644
--- a/doc/langref.html.in
+++ b/doc/langref.html.in
@@ -11551,7 +11551,8 @@ fn readU32Be() u32 {}
 </p>
 <p>
 Each LF may be immediately preceded by a single CR (byte value 0x0d, code point U+000d, {#syntax#}'\r'{#endsyntax#})
-to form a Windows style line ending, but this is discouraged.
+to form a Windows style line ending, but this is discouraged. Note that in mulitline strings, CRLF sequences will
+be encoded as LF when compiled into a zig program.
 A CR in any other context is not allowed.
 </p>
 <p>
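The langref sentence added above can be illustrated with a small Zig test. This is a minimal sketch rather than one of the tests added by the commit: the constant `message` and its contents are hypothetical, chosen only for illustration. It checks that the lines of a multiline string literal are joined by a single LF and that no CR byte appears in the compiled value. According to the commit message, the same should hold if the source file is saved with CRLF line endings, though that depends on how the file is stored on disk and cannot be shown in the listing itself.

```zig
const std = @import("std");

// Hypothetical example text; any multiline string literal would do.
const message =
    \\first line
    \\second line
;

test "multiline string lines are joined with LF only" {
    // Every source line except the last contributes exactly one '\n'.
    try std.testing.expectEqualStrings("first line\nsecond line", message);
    // No carriage return ends up in the compiled string.
    try std.testing.expect(std.mem.indexOfScalar(u8, message, '\r') == null);
}
```

Assuming the snippet is saved as `crlf_example.zig` (a made-up file name), it can be run with `zig test crlf_example.zig`.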
