public final class Pp2TokenStreamBuilder extends Object
Converts a TSourceTokenList to a Pp2TokenStream.
Implements the Delphi initTokenArray semantics from
gsp_vcl/pp/sqlion.pas: walk the token list, fold every
ttwhitespace / ttreturn token into the
precedingBlanks and precedingLinebreaks counts of the
next non-whitespace token. Comments
(ttsimplecomment, ttbracketedcomment) are not
folded — they are first-class tokens in pp2 so downstream stages can
preserve, reanchor, or reflow them per CommentPolicy.
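
The folding walk described above can be sketched as follows. This is a minimal illustration, not the actual builder: Kind, RawToken, FoldedToken and fold are hypothetical stand-ins for ETokenType, the source tokens, Pp2Token and build. For brevity it also assumes the input contains no "\r\n" pairs, so every '\r' or '\n' is one linebreak.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only: every type here is a hypothetical stand-in, not the pp2 API. */
final class FoldSketch {

    enum Kind { WHITESPACE, RETURN, COMMENT, SOLID }

    static final class RawToken {
        final Kind kind;
        final String text;
        RawToken(Kind kind, String text) { this.kind = kind; this.text = text; }
    }

    static final class FoldedToken {
        final String text;
        final int precedingBlanks;
        final int precedingLinebreaks;
        FoldedToken(String text, int precedingBlanks, int precedingLinebreaks) {
            this.text = text;
            this.precedingBlanks = precedingBlanks;
            this.precedingLinebreaks = precedingLinebreaks;
        }
    }

    static List<FoldedToken> fold(List<RawToken> source) {
        List<FoldedToken> out = new ArrayList<>();
        int blanks = 0;
        int linebreaks = 0;
        for (RawToken t : source) {
            if (t.kind == Kind.WHITESPACE || t.kind == Kind.RETURN) {
                // Foldable token: accumulate counts instead of emitting it.
                // Simplification (assumption): no "\r\n" pairs in the input.
                for (int i = 0; i < t.text.length(); i++) {
                    char c = t.text.charAt(i);
                    if (c == '\n' || c == '\r') {
                        linebreaks++;
                    } else {
                        blanks++;
                    }
                }
            } else {
                // Solid tokens and comments are both emitted as first-class tokens.
                out.add(new FoldedToken(t.text, blanks, linebreaks));
                blanks = 0;
                linebreaks = 0;
            }
        }
        // Counts still pending here correspond to trailing trivia; this sketch drops them.
        return out;
    }
}
```

Note that the comment token receives its own preceding counts exactly like a solid token does, which is what keeps it available to later stages.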
precedingLinebreaks is a count of logical new lines,
not raw characters. "\r\n" counts as one linebreak (not
two), as does a lone "\n" or a lone "\r". Downstream
layout rules in S25/S28 treat each logical linebreak as a new visual
line; counting CRLF as two would surface as a spurious blank line on
Windows-encoded scripts.
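
One way to implement this counting rule is shown below. It is a sketch under the assumptions stated in this page (CRLF, lone LF and lone CR each count once; every other character is a blank), and the class name WhitespaceCountSketch is hypothetical. The real countWhitespace may differ in detail.

```java
/** Illustrative sketch only; not the actual Pp2TokenStreamBuilder source. */
final class WhitespaceCountSketch {

    /** Returns int[2] with [0] = blanks and [1] = logical linebreaks. */
    static int[] countWhitespace(String text) {
        int blanks = 0;
        int linebreaks = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\r') {
                // Consume "\r\n" together as a single logical linebreak.
                if (i + 1 < text.length() && text.charAt(i + 1) == '\n') {
                    i++;
                }
                linebreaks++;
            } else if (c == '\n') {
                linebreaks++;
            } else {
                // Space, tab, form-feed, vertical tab, NBSP, ... are all blanks.
                blanks++;
            }
        }
        return new int[] { blanks, linebreaks };
    }
}
```

For example, the text "  \r\n\t" yields 3 blanks and 1 linebreak, while "\r\n\n\r" yields 0 blanks and 3 linebreaks.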
Byte-exact recovery of mixed line endings is not the spine's job —
the SourceSpanLedger (S8) records every byte of the original
input, including the precise CR/LF/CRLF sequence, so the region
assembler (S15) can restore them when emitting output.
After normalizing "\r\n" to "\n" in the original
input, the following holds when summing across the produced stream:
    Σ ( precedingLinebreaks + precedingBlanks + token.text.length() )
        + trailingBlanks + trailingLinebreaks
        == normalized input length
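
The length invariant can be checked with a few lines of code. The sketch below is illustrative only: the Tok class and both helper methods are hypothetical, and it assumes that token text is taken from the normalized input, so a multi-line comment contains "\n" rather than "\r\n".

```java
import java.util.List;

/** Illustrative sketch only: Tok and both helpers are hypothetical, not the pp2 API. */
final class InvariantSketch {

    static final class Tok {
        final String text;
        final int precedingBlanks;
        final int precedingLinebreaks;
        Tok(String text, int precedingBlanks, int precedingLinebreaks) {
            this.text = text;
            this.precedingBlanks = precedingBlanks;
            this.precedingLinebreaks = precedingLinebreaks;
        }
    }

    /** Left-hand side of the invariant: everything the stream accounts for. */
    static int accountedLength(List<Tok> stream, int trailingBlanks, int trailingLinebreaks) {
        int sum = 0;
        for (Tok t : stream) {
            sum += t.precedingLinebreaks + t.precedingBlanks + t.text.length();
        }
        return sum + trailingBlanks + trailingLinebreaks;
    }

    /** Right-hand side: input length once each "\r\n" is collapsed to "\n". */
    static int normalizedLength(String input) {
        return input.replace("\r\n", "\n").length();
    }
}
```

For the input "SELECT\r\n  a \n", a stream of SELECT (0 blanks, 0 linebreaks) and a (2 blanks, 1 linebreak) with 1 trailing blank and 1 trailing linebreak accounts for 12 characters, which matches the normalized length.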
The trailing trivia is returned alongside the stream so the source-span
ledger (S8) and the region assembler (S15) can place it back into the
output.

- "\r\n" consumed together → 1 linebreak.
- "\n" → 1 linebreak.
- "\r" → 1 linebreak.

countWhitespace() only sees lexer-classified
whitespace tokens, so non-whitespace UTF-8 characters never reach
it.

| Modifier and Type | Class and Description |
|---|---|
| static class | Pp2TokenStreamBuilder.BuildResult. Result of a build: the stream of solid + comment tokens, plus any trailing trivia characters that appeared after the last such token. |

| Constructor and Description |
|---|
| Pp2TokenStreamBuilder() |

| Modifier and Type | Method and Description |
|---|---|
| Pp2TokenStreamBuilder.BuildResult | build(TSourceTokenList source). Build a stream from the given source token list. |
| static int[] | countWhitespace(String text). Classify each character in a whitespace token's text as blank vs linebreak. |
| static boolean | isFoldable(ETokenType type). Is this token type rolled into preceding-whitespace counters rather than emitted as its own Pp2Token? Only true whitespace tokens; comments are first-class in pp2. |

public Pp2TokenStreamBuilder()

public Pp2TokenStreamBuilder.BuildResult build(TSourceTokenList source)
Build a stream from the given source token list.
Throws: NullPointerException - if source is null or contains a null element

public static boolean isFoldable(ETokenType type)
Is this token type rolled into preceding-whitespace counters rather than
emitted as its own Pp2Token? Only true whitespace
tokens; comments are first-class in pp2.

public static int[] countWhitespace(String text)
Classify each character in a whitespace token's text as blank vs
linebreak. '\n' and '\r' are linebreaks; everything
else (space, tab, form-feed, vertical tab, NBSP, ...) is a blank.
Returns: int[2] with [0]=blanks, [1]=linebreaks