public final class ParseRecoveryEngine extends Object
Parses the statement ranges produced by
StatementBoundaryDetector.detect(Pp2TokenStream, EDbVendor) and
tags each one as AST_OK, AST_ERROR, or TRIVIA.
parseAll(List) pre-marks every TRIVIA range and every
non-trivia range whose source slice exceeds
Pp2FormatOptions.maxRegionParseChars (the latter becomes
AST_ERROR with an engine note). Every remaining range is parsed
individually. The pool's single parser serves as the probe; if the
probe parse succeeds, the engine re-parses the same slice into a freshly
allocated TGSqlParser so the outcome's
RegionParseOutcome.getParser() contains exactly one statement and
can be handed to FormatterFactory.pp(parser, opt) without
rendering its siblings. AST_ERROR outcomes do not need a
dedicated parser — the lexical fallback (S14 / S31) renders them.
This is plan §13/R1's "fresh parser per region" contingency adopted as
the default. The cost is one extra parser allocation and re-parse per
AST_OK region; the benefit is that every outcome returned from a
single parseAll call is mutually valid in any consumption order
and GuardedAstDelegate (S13) can safely invoke pp's
statement-iterating renderer on each one without observing neighbours.
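The flow described above can be sketched as follows. This is a minimal illustration, not the engine's code: `FakeParser`, `Outcome`, and the "starts with SELECT" parse rule are hypothetical stand-ins, and trivia detection is simplified to a blank-slice check.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins only; none of these are the real pp2 or GSP types.
final class FreshParserSketch {
    enum Kind { AST_OK, AST_ERROR, TRIVIA }

    /** Stand-in parser: "parses" any slice that starts with SELECT. */
    static final class FakeParser {
        final List<String> statements = new ArrayList<>();

        boolean parse(String sql) {
            statements.clear();
            if (!sql.trim().toUpperCase().startsWith("SELECT")) {
                return false;
            }
            statements.add(sql.trim());
            return true;
        }
    }

    static final class Outcome {
        final Kind kind;
        final FakeParser parser; // null unless kind == AST_OK

        Outcome(Kind kind, FakeParser parser) {
            this.kind = kind;
            this.parser = parser;
        }
    }

    static List<Outcome> parseAll(List<String> slices, int maxRegionParseChars) {
        FakeParser probe = new FakeParser(); // plays the role of the pool's single parser
        List<Outcome> out = new ArrayList<>();
        for (String slice : slices) {
            if (slice.trim().isEmpty()) {
                out.add(new Outcome(Kind.TRIVIA, null));    // pre-marked, never parsed
            } else if (slice.length() > maxRegionParseChars) {
                out.add(new Outcome(Kind.AST_ERROR, null)); // oversized, parse skipped
            } else if (probe.parse(slice)) {
                FakeParser fresh = new FakeParser();        // one extra allocation + re-parse
                fresh.parse(slice);
                out.add(new Outcome(Kind.AST_OK, fresh));
            } else {
                out.add(new Outcome(Kind.AST_ERROR, null)); // left to the lexical fallback
            }
        }
        return out;
    }
}
```

Because every AST_OK outcome owns its parser, the outcomes can be consumed in any order without one parse overwriting another.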
A previous draft of this engine attempted a "lazy full-script first"
optimisation: one parse() on the entire original SQL, with all
happy-path outcomes sharing the parser snapshot. That broke
FormatterFactory.pp(parser, opt) which iterates every statement
in parser.sqlstatements — each outcome would have rendered the
whole script. The optimisation is deferred (plan §16/Q3 / S37) and may
be re-introduced once a render API that targets a single statement
exists.
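To make the failure mode concrete, here is a small sketch under stated assumptions: `renderAll` is a hypothetical stand-in for a renderer that iterates every statement a parser holds, and a parser is modelled as a plain list of statement strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model: a "parser" is just the list of statements it holds.
final class SharedParserPitfall {
    /** Stand-in for a statement-iterating renderer: emits every statement in the parser. */
    static String renderAll(List<String> parserStatements) {
        return String.join("\n", parserStatements);
    }

    /** One render call per outcome; each outcome carries a (possibly shared) parser. */
    static List<String> renderEach(List<List<String>> parserPerOutcome) {
        List<String> rendered = new ArrayList<>();
        for (List<String> parserStatements : parserPerOutcome) {
            rendered.add(renderAll(parserStatements));
        }
        return rendered;
    }
}
```

When two outcomes share one two-statement parser, each render call emits both statements, so the script is rendered twice. With one single-statement parser per outcome, each call emits only its own statement.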
- Regions whose source slice exceeds
  Pp2FormatOptions.maxRegionParseChars skip the parse attempt
  entirely and are returned as AST_ERROR with an engine note;
  the lexical fallback (S14 / S31) handles the rendering.
- Any Throwable thrown by TGSqlParser.parse() is
  caught and converted to an AST_ERROR outcome. The pool is
  still reset before the next attempt, so a single misbehaving region
  does not corrupt the rest of the script.
- Trivia ranges are returned as TRIVIA without invoking the parser.

The pool is size-1. Each parseRegion(StatementRange) call
overwrites the prior region's parser state in place, which means a
parser reference from an earlier
parseRegion call becomes stale once the next call returns. See
the outcome class Javadoc for the contract.
parseAll(List), by contrast, allocates a fresh parser per
AST_OK region, so outcomes returned from a single
parseAll call are mutually valid and outlive the engine call
itself (each carries an isolated parser of its own).
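The stale-reference hazard of a size-1 pool can be illustrated with a small sketch. `SingleSlotPoolSketch` and `FakeParser` are hypothetical names used only for illustration; the real pool and parser classes are not modelled here.

```java
// Hypothetical model of a size-1 pool whose only parser is reused in place.
final class SingleSlotPoolSketch {
    static final class FakeParser {
        String parsedSql; // whatever the parser last parsed
    }

    private final FakeParser slot = new FakeParser();

    /** Every call returns the same parser object, overwritten with the new slice. */
    FakeParser parseRegion(String slice) {
        slot.parsedSql = slice;
        return slot;
    }
}
```

A caller that keeps the reference from the first call will observe the second region's state after the second call returns, which is why such references must be consumed before the next parseRegion call.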
This class is single-threaded by construction (the pool's threading contract); concurrent use is undefined.
Plan reference: §5.1, §7.3/S12, §7.4/S12, §10.4, §13/R1.
| Constructor and Description |
|---|
| ParseRecoveryEngine(EDbVendor vendor, String originalSql, Pp2TokenStream stream, Pp2FormatOptions opts) |
| Modifier and Type | Method and Description |
|---|---|
| ParserPool | getPool(): Pool access, exposed for tests / introspection. |
| EDbVendor | getVendor(): EDbVendor this engine targets. |
| boolean | isTrivia(StatementRange range): True when the range contains no solid tokens: only whitespace, comments, and at most a trailing terminator (; or vendor-specific GO). |
| List<RegionParseOutcome> | parseAll(List<StatementRange> ranges): Parse all ranges. |
| RegionParseOutcome | parseRegion(StatementRange range): Parse a single region. |
| String | sliceFor(StatementRange range): Source slice covered by the range. |
public ParseRecoveryEngine(EDbVendor vendor, String originalSql, Pp2TokenStream stream, Pp2FormatOptions opts)
public ParserPool getPool()
public List<RegionParseOutcome> parseAll(List<StatementRange> ranges)
1. Mark every TRIVIA range up front (no parser involvement).
2. Mark every non-trivia range whose source slice exceeds
   Pp2FormatOptions.maxRegionParseChars as AST_ERROR
   with an engine note.
3. Parse each remaining range individually. When the probe parse is
   AST_OK, re-parse the slice into a freshly allocated
   TGSqlParser so the outcome's parser holds exactly one
   statement and is safe to hand to
   FormatterFactory.pp(parser, opt).

Parameters: ranges - non-null list of statement ranges, in source order
Throws: NullPointerException - if ranges or any element is null
public RegionParseOutcome parseRegion(StatementRange range)
Behaviour:
- If the range is trivia,
  returns TRIVIA without touching the parser.
- If the source slice exceeds opts.maxRegionParseChars, returns
  AST_ERROR with an engine note. Skips parsing entirely.
- Otherwise parses the slice. Any Throwable from
  parse() is wrapped as AST_ERROR.

public boolean isTrivia(StatementRange range)
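True when the range contains no solid tokens: only whitespace,
comments, and at most a trailing terminator (; or
vendor-specific GO). Trivia ranges are short-circuited to
TRIVIA and never reach the parser.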
The last-token-as-terminator exclusion is keyed off
StatementRange.getTerminator() rather than text matching, so a
mid-statement identifier that happens to spell GO is still
treated as solid.
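A minimal sketch of this rule, assuming a simplified token model: `Token`, `TokenKind`, and this `Terminator` enum are hypothetical stand-ins, and the terminator is assumed to appear as the last solid token of the range whenever the range's terminator is not NONE.

```java
import java.util.List;

// Hypothetical token model; the real Pp2TokenStream and StatementRange are not used here.
final class TriviaSketch {
    enum Terminator { NONE, SEMICOLON, GO }
    enum TokenKind { WHITESPACE, COMMENT, SOLID }

    static final class Token {
        final TokenKind kind;
        final String text;

        Token(TokenKind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    /** True when the only solid token, if any, is the range's trailing terminator. */
    static boolean isTrivia(List<Token> tokens, Terminator terminator) {
        int lastSolid = -1;
        for (int i = 0; i < tokens.size(); i++) {
            if (tokens.get(i).kind == TokenKind.SOLID) {
                lastSolid = i;
            }
        }
        for (int i = 0; i < tokens.size(); i++) {
            if (tokens.get(i).kind != TokenKind.SOLID) {
                continue;
            }
            // The exclusion is keyed off the range's terminator, never the token text.
            boolean trailingTerminator = terminator != Terminator.NONE && i == lastSolid;
            if (!trailingTerminator) {
                return false;
            }
        }
        return true;
    }
}
```

In this model a token spelled GO counts as solid unless the range itself reports a terminator and that token is the last solid one, which mirrors the text-independent check described above.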
public String sliceFor(StatementRange range)
When the range's terminator is StatementRange.Terminator.GO,
the trailing GO token is excluded from the slice. The GSP
MSSQL parser treats GO as a batch separator that splits the
stream into multiple sqlstatements, which would otherwise
trigger RegionParseOutcome's multi-statement guard. Stripping
GO from the per-region slice keeps the parse contract
single-statement.
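The slicing rule can be sketched as below. The offsets-based signature is an assumption made for illustration (the real method takes a StatementRange); `terminatorStart` is a hypothetical parameter giving the offset where the terminator token begins.

```java
// Hypothetical offsets-based model of the slicing rule; not the real sliceFor signature.
final class SliceSketch {
    enum Terminator { NONE, SEMICOLON, GO }

    /**
     * Returns sql[start, end), except that a trailing GO terminator
     * (starting at terminatorStart) is cut off the slice.
     */
    static String sliceFor(String sql, int start, int end,
                           Terminator terminator, int terminatorStart) {
        if (terminator == Terminator.GO) {
            return sql.substring(start, terminatorStart); // GO excluded
        }
        return sql.substring(start, end);                 // ; stays in the slice
    }
}
```

For the script `"SELECT 1\nGO\nSELECT 2;"`, the first range (offsets 0 to 11, GO starting at 9) yields `"SELECT 1\n"`, while the second range (offsets 12 to 21) keeps its semicolon and yields `"SELECT 2;"`.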