public abstract class AbstractSqlParser extends Object implements SqlParser
This class implements the Template Method Pattern, defining the skeleton of the parsing algorithm while allowing subclasses to override specific steps. It provides default implementations for common operations and hooks for vendor-specific customization.
Design Pattern: Template Method
Template methods: parse(ParserContext), tokenize(ParserContext)
Parsing Algorithm (Template Method):
- getLexer(ParserContext)
- performTokenization(ParserContext, TCustomLexer)
- processTokensBeforeParse(ParserContext, TSourceTokenList)
- getParser(ParserContext, TSourceTokenList)
- performParsing(ParserContext, TCustomParser, TCustomParser, TSourceTokenList, TStatementList)
- performSemanticAnalysis(ParserContext, TStatementList)
Subclass Responsibilities:
public class OracleSqlParser extends AbstractSqlParser {
    public OracleSqlParser() {
        super(EDbVendor.dbvoracle);
        this.delimiterChar = '/';
    }

    // Must implement abstract methods
    protected TCustomLexer getLexer(ParserContext context) {
        return new TLexerOracle();
    }

    protected TCustomParser getParser(ParserContext context, TSourceTokenList tokens) {
        return new TParserOracleSql(tokens);
    }
    // ... other abstract methods

    // Optionally override hook methods
    protected void processTokensBeforeParse(ParserContext context, TSourceTokenList tokens) {
        // Oracle-specific token processing
    }
}
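The Template Method structure described for this class can be illustrated in isolation. The sketch below is a minimal stand-in, not the gsqlparser implementation: `TemplateParserSketch`, `WhitespaceParserSketch`, the `trace` list, and the use of plain strings for tokens and statements are all hypothetical, chosen only to show a `final` template method calling abstract steps and overridable hooks in a fixed order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the template-method structure; none of these
// types are the real gsqlparser classes.
abstract class TemplateParserSketch {
    // Records the order in which steps run, for illustration only.
    protected final List<String> trace = new ArrayList<>();

    // Template method: final, so subclasses cannot reorder the steps.
    public final List<String> parse(String sql) {
        trace.clear();
        List<String> tokens = tokenize(sql);               // abstract step
        processTokensBeforeParse(tokens);                  // hook with a default
        List<String> statements = performParsing(tokens);  // abstract step
        performSemanticAnalysis(statements);               // hook with a default
        return statements;
    }

    protected abstract List<String> tokenize(String sql);

    protected abstract List<String> performParsing(List<String> tokens);

    // Hooks: subclasses may override these, but do not have to.
    protected void processTokensBeforeParse(List<String> tokens) {
        trace.add("processTokensBeforeParse");
    }

    protected void performSemanticAnalysis(List<String> statements) {
        trace.add("performSemanticAnalysis");
    }
}

// A toy "vendor" that only supplies the abstract steps.
class WhitespaceParserSketch extends TemplateParserSketch {
    @Override
    protected List<String> tokenize(String sql) {
        trace.add("tokenize");
        List<String> tokens = new ArrayList<>();
        for (String t : sql.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    @Override
    protected List<String> performParsing(List<String> tokens) {
        trace.add("performParsing");
        // Treats the whole token list as a single statement.
        List<String> statements = new ArrayList<>();
        statements.add(String.join(" ", tokens));
        return statements;
    }
}
```

Because `parse` is `final`, a subclass can change what each step does but not the sequence in which the steps run.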
See Also: SqlParser, ParserContext, SqlParseResult

| Modifier and Type | Class and Description |
|---|---|
| protected static class | AbstractSqlParser.PreparedSqlReader |

| Modifier and Type | Field and Description |
|---|---|
| protected String | defaultDelimiterStr |
| protected char | delimiterChar |
| protected Stack<gudusoft.gsqlparser.compiler.TFrame> | frameStack<br>Frame stack for scope management during parsing. |
| protected gudusoft.gsqlparser.compiler.TContext | globalContext<br>Global context for semantic analysis. |
| protected gudusoft.gsqlparser.compiler.TFrame | globalFrame<br>Global frame pushed to frame stack during parsing. |
| protected TCustomLexer | lexer<br>The lexer instance used for tokenization. |
| protected ParserContext | parserContext<br>Current parser context for the ongoing parse operation. |
| protected TSourceTokenList | sourcetokenlist<br>Token list container - created once in constructor, cleared before each parse. |
| protected ISqlCmds | sqlcmds<br>SQL command resolver for identifying statement types (SELECT, INSERT, etc.). |
| protected TSQLEnv | sqlEnv<br>SQL environment for semantic analysis. |
| protected TStatementList | sqlstatements<br>Statement list container - created once in constructor, cleared before each extraction. |
| protected List<TSyntaxError> | syntaxErrors |
| protected EDbVendor | vendor |

| Modifier | Constructor and Description |
|---|---|
| protected | AbstractSqlParser(EDbVendor vendor)<br>Construct parser for given database vendor. |

| Modifier and Type | Method and Description |
|---|---|
| protected void | afterStatementParsed(TCustomSqlStatement stmt)<br>Hook method for vendor-specific post-processing after a statement is parsed. |
| protected int | attemptErrorRecovery(TCustomSqlStatement statement, int parseResult, boolean onlyNeedRawParseTree)<br>Attempt error recovery for CREATE TABLE/INDEX statements with unsupported options. |
| protected void | copyErrorsFromStatement(TCustomSqlStatement statement)<br>Copy error messages from a statement to the parser's error collection. |
| protected void | doAfterTokenize(TSourceTokenList tokens)<br>Post-tokenization normalization. |
| TStatementList | doExtractRawStatements(ParserContext context, TSourceTokenList tokens)<br>Extract raw statements without full parsing (public API). |
| protected SqlParseResult | extractRawStatements(ParserContext context, TSourceTokenList tokens, TCustomLexer lexer, long tokenizationTimeMs)<br>Extract raw statements without full parsing. |
| protected abstract void | extractVendorRawStatements(SqlParseResult.Builder builder)<br>Call vendor-specific raw statement extraction logic. |
| protected TSourceToken | getanewsourcetoken()<br>Get next source token from the lexer. |
| String | getDefaultDelimiterStr()<br>Get the default delimiter string for this vendor. |
| char | getDelimiterChar()<br>Get the delimiter character for this vendor. |
| int | getErrorCount()<br>Get the count of syntax errors. |
| protected abstract TCustomLexer | getLexer(ParserContext context)<br>Get the lexer for this vendor. |
| protected abstract TCustomParser | getParser(ParserContext context, TSourceTokenList tokens)<br>Get the main parser for this vendor. |
| SqlParseResult | getrawsqlstatements(ParserContext context)<br>Template method for extracting raw statements without full parsing. |
| protected TCustomParser | getSecondaryParser(ParserContext context, TSourceTokenList tokens)<br>Get secondary parser (e.g., PL/SQL for Oracle). |
| List<TSyntaxError> | getSyntaxErrors()<br>Get the syntax errors collected during parsing. |
| EDbVendor | getVendor()<br>Get the database vendor this parser handles. |
| protected void | handleStatementParsingException(TCustomSqlStatement stmt, int statementIndex, Exception ex)<br>Handle exceptions that occur during individual statement parsing. |
| protected void | initializeGlobalContext()<br>Initialize global context and frame stack for statement parsing. |
| protected boolean | isDollarFunctionDelimiter(int tokencode, EDbVendor dbVendor)<br>Check if a token is a dollar function delimiter ($$, $tag$, etc.) for PostgreSQL-family databases. |
| protected void | onRawStatementComplete(ParserContext context, TCustomSqlStatement statement, TCustomParser mainParser, TCustomParser secondaryParser, TStatementList statementList, boolean isLastStatement, SqlParseResult.Builder builder)<br>Hook method called when a raw statement is complete. |
| protected void | onRawStatementCompleteVendorSpecific(TCustomSqlStatement statement)<br>Hook for vendor-specific logic when a raw statement is completed. |
| SqlParseResult | parse(ParserContext context)<br>Template method for full parsing. |
| protected void | performInterpreter(ParserContext context, TStatementList statements)<br>Perform interpretation/evaluation on parsed statements. |
| protected abstract TStatementList | performParsing(ParserContext context, TCustomParser parser, TCustomParser secondaryParser, TSourceTokenList tokens, TStatementList rawStatements)<br>Perform actual parsing with syntax checking. |
| protected void | performSemanticAnalysis(ParserContext context, TStatementList statements)<br>Perform semantic analysis on parsed statements. |
| protected TSourceTokenList | performTokenization(ParserContext context, TCustomLexer lexer)<br>Perform tokenization using vendor-specific lexer. |
| protected AbstractSqlParser.PreparedSqlReader | prepareSqlReader(ParserContext context) |
| protected void | processTokensBeforeParse(ParserContext context, TSourceTokenList tokens)<br>Process tokens before parsing (vendor-specific adjustments). |
| protected void | processTokensInTokenTable(ParserContext context, TCustomLexer lexer, TSourceTokenList tokens)<br>Process tokens using token table (vendor-specific token code adjustments). |
| void | setTokenHandle(ITokenHandle tokenHandle)<br>Set an event handler which will be fired when a new source token is created by the lexer during tokenization. |
| protected abstract void | setupVendorParsersForExtraction()<br>Setup vendor-specific parsers for raw statement extraction. |
| SqlParseResult | tokenize(ParserContext context)<br>Template method for tokenization only (without full parsing). |
| protected abstract void | tokenizeVendorSql()<br>Call vendor-specific tokenization logic. |
| String | toString() |
| protected String | towinlinebreak(String s)<br>Convert line breaks to Windows format. |

protected char delimiterChar
protected String defaultDelimiterStr
protected List<TSyntaxError> syntaxErrors
protected TSourceTokenList sourcetokenlist
This follows the component reuse pattern to avoid allocation overhead.
protected TStatementList sqlstatements
This follows the component reuse pattern to avoid allocation overhead.
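The "create once, clear before each use" idea behind these two containers can be sketched as follows. This is an illustration under assumptions: `ReusableContainerSketch` and its whitespace tokenizer are hypothetical, and a plain `List<String>` stands in for `TSourceTokenList` and `TStatementList`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a container that is allocated once and reused.
class ReusableContainerSketch {
    // Allocated once in the constructor, never replaced afterwards.
    private final List<String> tokens;

    ReusableContainerSketch() {
        this.tokens = new ArrayList<>();
    }

    // Each call clears the shared container instead of allocating a new one.
    List<String> tokenize(String sql) {
        tokens.clear();
        for (String t : sql.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }
}
```

One consequence of this design is that a caller holding the result of an earlier call sees it overwritten by the next call, so results that must outlive the next parse need to be copied.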
protected ParserContext parserContext
Set at the beginning of each parse operation, contains input SQL and options.
protected ISqlCmds sqlcmds
Initialized lazily using SqlCmdsFactory.get(vendor) - vendor-specific implementation.
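Lazy initialization of a field of this kind might look like the sketch below. `LazyFieldSketch` and its `factoryCalls` counter are hypothetical, and the `Supplier` only stands in for a factory call such as `SqlCmdsFactory.get(vendor)`; the sketch is not thread-safe and makes no claim about how the real class guards the field.

```java
import java.util.function.Supplier;

// Hypothetical holder that builds its value on first access only.
class LazyFieldSketch<T> {
    private final Supplier<T> factory;
    private T value;       // stays null until the first get()
    int factoryCalls = 0;  // counts factory invocations, for illustration

    LazyFieldSketch(Supplier<T> factory) {
        this.factory = factory;
    }

    T get() {
        if (value == null) {
            value = factory.get();
            factoryCalls++;
        }
        return value;
    }
}
```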
protected TCustomLexer lexer
Subclasses should set this field in their constructor to their specific lexer instance. This allows common tokenization logic in AbstractSqlParser to access the lexer generically.
protected gudusoft.gsqlparser.compiler.TContext globalContext
Created during performParsing phase, contains SQL environment and statement references.
protected TSQLEnv sqlEnv
Vendor-specific environment configuration, used by resolver and semantic analysis.
protected Stack<gudusoft.gsqlparser.compiler.TFrame> frameStack
Used to track nested scopes (global, statement, block-level) during parsing.
protected gudusoft.gsqlparser.compiler.TFrame globalFrame
Represents the outermost scope, must be popped after parsing completes.
protected AbstractSqlParser(EDbVendor vendor)
vendor - the database vendor
public EDbVendor getVendor()
SqlParser
public void setTokenHandle(ITokenHandle tokenHandle)
tokenHandle - the event handler that processes each newly created source token
public final SqlParseResult parse(ParserContext context)
This method defines the skeleton of the parsing algorithm. Subclasses should NOT override this method; instead, they should override the abstract methods and hook methods called by this template.
Algorithm:
public final SqlParseResult tokenize(ParserContext context)
This method is used by getrawsqlstatements() which only
needs tokenization and raw statement extraction, without detailed
syntax checking or semantic analysis.
Algorithm:
public final SqlParseResult getrawsqlstatements(ParserContext context)
This method performs tokenization and raw statement extraction, but skips the expensive full parsing and semantic analysis steps.
Algorithm:
- tokenize(ParserContext)
- extractRawStatements(ParserContext, TSourceTokenList, TCustomLexer, long)
Equivalent to legacy API: TGSqlParser.getrawsqlstatements()
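The two-phase idea (tokenize, then group tokens into raw statements without checking syntax) can be sketched on plain strings. Everything here is an assumption made for illustration: `RawStatementSketch`, the whitespace tokenizer, and the single `;` delimiter are hypothetical, and the sketch ignores string literals and comments that a real lexer would handle.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical two-step pipeline: tokenize, then split at delimiters.
class RawStatementSketch {
    // Step 1: naive tokenization; ';' becomes a token of its own.
    static List<String> tokenize(String sql) {
        List<String> tokens = new ArrayList<>();
        for (String t : sql.replace(";", " ; ").trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Step 2: group tokens into raw statements at each delimiter.
    // No grammar is consulted, so invalid SQL is still split correctly.
    static List<List<String>> extractRawStatements(List<String> tokens, String delimiter) {
        List<List<String>> statements = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String t : tokens) {
            if (t.equals(delimiter)) {
                if (!current.isEmpty()) {
                    statements.add(current);
                    current = new ArrayList<>();
                }
            } else {
                current.add(t);
            }
        }
        if (!current.isEmpty()) {
            statements.add(current);
        }
        return statements;
    }
}
```

Since no syntax checking happens, a malformed statement such as `select nonsense from` is still returned as a raw statement; detecting that it is invalid is left to the full parse.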
getrawsqlstatements in interface SqlParser
context - immutable context with all inputs
protected abstract TCustomLexer getLexer(ParserContext context)
Subclass Responsibility: Return vendor-specific lexer instance. The lexer may be created fresh or cached/reused for performance.
Example:
protected TCustomLexer getLexer(ParserContext context) {
    TLexerOracle lexer = new TLexerOracle();
    lexer.delimiterchar = delimiterChar;
    lexer.defaultDelimiterStr = defaultDelimiterStr;
    return lexer;
}
context - the parser context
protected abstract TCustomParser getParser(ParserContext context, TSourceTokenList tokens)
Subclass Responsibility: Return vendor-specific parser instance. The parser may be created fresh or cached/reused for performance. If reusing, the token list should be updated.
Example:
protected TCustomParser getParser(ParserContext context, TSourceTokenList tokens) {
    TParserOracleSql parser = new TParserOracleSql(tokens);
    parser.lexer = getLexer(context);
    return parser;
}
context - the parser context
tokens - the source token list
protected TSourceTokenList performTokenization(ParserContext context, TCustomLexer lexer)
Template Method: This method implements the common tokenization algorithm across all database vendors. Subclasses customize through one hook:
- tokenizeVendorSql() - Call vendor-specific tokenization logic
Algorithm:
- tokenizeVendorSql() hook
context - parser context with SQL input configuration
lexer - the lexer instance (same as this.flexer)
RuntimeException - if tokenization fails
protected abstract void tokenizeVendorSql()
Hook Method: Called by performTokenization(ParserContext, TCustomLexer) to execute vendor-specific SQL-to-token conversion logic.
Subclass Responsibility: Call the vendor-specific tokenization method (e.g., dooraclesqltexttotokenlist, domssqlsqltexttotokenlist) which reads from lexer and populates sourcetokenlist.
Example (Oracle):
protected void tokenizeVendorSql() {
    dooraclesqltexttotokenlist();
}
Example (MSSQL):
protected void tokenizeVendorSql() {
    domssqlsqltexttotokenlist();
}
Example (PostgreSQL):
protected void tokenizeVendorSql() {
    dopostgresqltexttotokenlist();
}
public final TStatementList doExtractRawStatements(ParserContext context, TSourceTokenList tokens)
This public method allows external callers (like TGSqlParser) to extract raw statements from an already-tokenized source list without re-tokenization.
doExtractRawStatements in interface SqlParser
context - the parser context
tokens - the source token list (already tokenized)
protected SqlParseResult extractRawStatements(ParserContext context, TSourceTokenList tokens, TCustomLexer lexer, long tokenizationTimeMs)
Template Method: This method implements the common algorithm for extracting raw statements across all database vendors. Subclasses customize the process through two hook methods:
- setupVendorParsersForExtraction() - Initialize vendor parsers
- extractVendorRawStatements(SqlParseResult.Builder) - Call vendor extraction logic
Algorithm:
- setupVendorParsersForExtraction() hook
- extractVendorRawStatements(SqlParseResult.Builder) hook
context - the parser context
tokens - the source token list
lexer - the lexer instance (for including in result)
tokenizationTimeMs - tokenization time from tokenize() step
protected abstract void setupVendorParsersForExtraction()
Hook Method: Called by extractRawStatements(ParserContext, TSourceTokenList, TCustomLexer, long) after initializing sqlcmds but before calling the vendor-specific extraction logic.
Subclass Responsibility: Inject sqlcmds into vendor parser(s) and update their token lists.
Example (MSSQL):
protected void setupVendorParsersForExtraction() {
    this.fparser.sqlcmds = this.sqlcmds;
    this.fparser.sourcetokenlist = this.sourcetokenlist;
}
Example (Oracle with dual parsers):
protected void setupVendorParsersForExtraction() {
    this.fparser.sqlcmds = this.sqlcmds;
    this.fplsqlparser.sqlcmds = this.sqlcmds;
    this.fparser.sourcetokenlist = this.sourcetokenlist;
    this.fplsqlparser.sourcetokenlist = this.sourcetokenlist;
}
protected abstract void extractVendorRawStatements(SqlParseResult.Builder builder)
Hook Method: Called by extractRawStatements(ParserContext, TSourceTokenList, TCustomLexer, long) to execute the vendor-specific logic for identifying statement boundaries.
Subclass Responsibility: Call the vendor-specific extraction method (e.g., dooraclegetrawsqlstatements, domssqlgetrawsqlstatements) passing the builder. The extraction method will populate the builder with raw statements.
Example (Oracle):
protected void extractVendorRawStatements(SqlParseResult.Builder builder) {
    dooraclegetrawsqlstatements(builder);
}
Example (MSSQL):
protected void extractVendorRawStatements(SqlParseResult.Builder builder) {
    domssqlgetrawsqlstatements(builder);
}
builder - the result builder to populate with raw statements
protected abstract TStatementList performParsing(ParserContext context, TCustomParser parser, TCustomParser secondaryParser, TSourceTokenList tokens, TStatementList rawStatements)
Subclass Responsibility: Parse SQL using vendor-specific parser and optional secondary parser (e.g., PL/SQL for Oracle).
Important: This method receives raw statements that have already been
extracted by getrawsqlstatements(ParserContext). Subclasses should NOT
re-extract statements - just parse each statement to build the AST.
Example:
protected TStatementList performParsing(ParserContext context,
                                        TCustomParser parser,
                                        TCustomParser secondaryParser,
                                        TSourceTokenList tokens,
                                        TStatementList rawStatements) {
    // Use the passed-in rawStatements (DO NOT re-extract!)
    for (int i = 0; i < rawStatements.size(); i++) {
        TCustomSqlStatement stmt = rawStatements.get(i);
        stmt.parsestatement(...); // Build AST for each statement
    }
    return rawStatements;
}
context - the parser context
parser - the main parser instance
secondaryParser - secondary parser (may be null)
tokens - the source token list
rawStatements - raw statements already extracted (never null)
protected TCustomParser getSecondaryParser(ParserContext context, TSourceTokenList tokens)
Hook Method: Default implementation returns null. Override if vendor needs a secondary parser. The parser may be created fresh or cached/reused for performance.
Example (Oracle):
protected TCustomParser getSecondaryParser(ParserContext context, TSourceTokenList tokens) {
    TParserOraclePLSql plsqlParser = new TParserOraclePLSql(tokens);
    plsqlParser.lexer = getLexer(context);
    return plsqlParser;
}
context - the parser context
tokens - the source token list
protected void doAfterTokenize(TSourceTokenList tokens)
Handles matching parentheses wrapping around SQL and marks semicolons before closing parens to be ignored.
Extracted from: TGSqlParser.doAfterTokenize() (lines 5123-5161)
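The semicolon part of this normalization can be pictured with a small sketch. It is an assumption-laden simplification: `WrappedSemicolonSketch`, `Tok`, and the `ignored` flag are hypothetical, and the sketch only looks at the token immediately after each `;`, without the parenthesis matching or whitespace handling the real method would need.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: flag ';' tokens that sit right before ')'.
class WrappedSemicolonSketch {
    // Minimal token: its text plus an "ignored" marker.
    static final class Tok {
        final String text;
        boolean ignored;

        Tok(String text) {
            this.text = text;
        }
    }

    // Marks every ';' whose next token is ')' as ignored.
    static void markSemicolonsBeforeClosingParen(List<Tok> tokens) {
        for (int i = 0; i + 1 < tokens.size(); i++) {
            if (tokens.get(i).text.equals(";") && tokens.get(i + 1).text.equals(")")) {
                tokens.get(i).ignored = true;
            }
        }
    }

    // Convenience factory for building a token list from strings.
    static List<Tok> of(String... texts) {
        List<Tok> tokens = new ArrayList<>();
        for (String t : texts) {
            tokens.add(new Tok(t));
        }
        return tokens;
    }
}
```

For an input such as `( select 1 ; ) ;`, only the inner semicolon is flagged; the trailing one still terminates the statement.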
tokens - the source token list (mutable)
protected void processTokensInTokenTable(ParserContext context, TCustomLexer lexer, TSourceTokenList tokens)
Currently handles BigQuery and Snowflake to convert DO keywords to identifiers when there's no corresponding WHILE/FOR.
Extracted from: TGSqlParser.processTokensInTokenTable() (lines 5186-5209)
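One way to picture the DO adjustment is a single pass that pairs each DO with an earlier WHILE or FOR. The pairing rule below is an assumption, not the documented algorithm: `DoKeywordSketch` and `doTokensToDemote` are hypothetical names, and the sketch does not distinguish an opening `WHILE` from the one in `END WHILE`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration: find DO tokens with no preceding WHILE/FOR.
class DoKeywordSketch {
    // Returns the indexes of DO tokens that would be treated as identifiers.
    static List<Integer> doTokensToDemote(List<String> tokens) {
        List<Integer> demoted = new ArrayList<>();
        int pendingLoops = 0; // WHILE/FOR seen but not yet paired with a DO
        for (int i = 0; i < tokens.size(); i++) {
            String t = tokens.get(i).toUpperCase(Locale.ROOT);
            if (t.equals("WHILE") || t.equals("FOR")) {
                pendingLoops++;
            } else if (t.equals("DO")) {
                if (pendingLoops > 0) {
                    pendingLoops--;   // loop keyword: leave it alone
                } else {
                    demoted.add(i);   // no loop header: treat as identifier
                }
            }
        }
        return demoted;
    }
}
```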
context - the parser context
lexer - the lexer (for accessing TOKEN_TABLE)
tokens - the source token list (mutable)
protected void processTokensBeforeParse(ParserContext context, TSourceTokenList tokens)
Hook Method: Default implementation handles Snowflake consecutive semicolons. Override if vendor needs additional token preprocessing.
Extracted from: TGSqlParser.processTokensBeforeParse() (lines 5165-5184)
Example:
protected void processTokensBeforeParse(ParserContext context, TSourceTokenList tokens) {
    super.processTokensBeforeParse(context, tokens); // Call base implementation
    // Add vendor-specific processing...
}
context - the parser context
tokens - the source token list (mutable)
protected void performSemanticAnalysis(ParserContext context, TStatementList statements)
Hook Method: Default implementation does nothing. Override to provide vendor-specific semantic analysis.
Typical Implementation:
context - the parser context
statements - the parsed statements (mutable)
protected void performInterpreter(ParserContext context, TStatementList statements)
Hook Method: Default implementation does nothing. Override to provide AST interpretation/evaluation.
Typical Implementation:
context - the parser context
statements - the parsed statements (mutable)
protected void copyErrorsFromStatement(TCustomSqlStatement statement)
This method should be called by performParsing implementations when a statement has syntax errors.
statement - the statement with errors
protected int attemptErrorRecovery(TCustomSqlStatement statement, int parseResult, boolean onlyNeedRawParseTree)
When parsing CREATE TABLE or CREATE INDEX statements, the parser may encounter vendor-specific options that are not in the grammar. This method implements the legacy error recovery behavior by marking unsupported tokens after the main definition as SQL*Plus commands (effectively ignoring them).
Recovery Strategy:
When to call: After parsing a statement that has errors. Only recovers if ENABLE_ERROR_RECOVER_IN_CREATE_TABLE is true.
statement - the statement to attempt recovery on
parseResult - the result code from parsing (0 = success)
onlyNeedRawParseTree - whether only raw parse tree is needed
public List<TSyntaxError> getSyntaxErrors()
public int getErrorCount()
protected boolean isDollarFunctionDelimiter(int tokencode, EDbVendor dbVendor)
Migrated from TGSqlParser.isDollarFunctionDelimiter() (lines 5074-5080).
Dollar-quoted strings are used in PostgreSQL-family databases to delimit function bodies. Each vendor has its own delimiter token code.
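The delimiter syntax itself (`$$` or `$tag$`) can be recognized textually, as in the sketch below. This only illustrates the syntax: the documented method compares token codes per vendor rather than token text, `DollarQuoteSketch` is a hypothetical name, and the pattern is limited to ASCII tags.

```java
import java.util.regex.Pattern;

// Hypothetical text-based check for PostgreSQL-style dollar-quote delimiters.
class DollarQuoteSketch {
    // "$$", or "$tag$" where tag looks like an identifier without '$'.
    private static final Pattern DELIMITER =
            Pattern.compile("\\$(?:[A-Za-z_][A-Za-z0-9_]*)?\\$");

    static boolean isDollarDelimiter(String tokenText) {
        return DELIMITER.matcher(tokenText).matches();
    }
}
```

With this pattern, `$$` and `$body$` are accepted, while positional parameters such as `$1` are not.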
tokencode - the token code to check
dbVendor - the database vendor
protected void onRawStatementComplete(ParserContext context, TCustomSqlStatement statement, TCustomParser mainParser, TCustomParser secondaryParser, TStatementList statementList, boolean isLastStatement, SqlParseResult.Builder builder)
This method is called by vendor-specific raw statement extraction methods (e.g., dooraclegetrawsqlstatements) when a statement boundary is detected. It sets up the statement with parser references and adds it to the statement list.
context - parser context
statement - the completed statement
mainParser - main parser instance
secondaryParser - secondary parser instance (may be null)
statementList - statement list to add to
isLastStatement - true if this is the last statement
builder - optional result builder (used during raw statement extraction, may be null)
protected void onRawStatementCompleteVendorSpecific(TCustomSqlStatement statement)
Migrated from TGSqlParser.doongetrawsqlstatementevent() (lines 5129-5178).
This method is called after basic statement setup but before adding to the statement list. Subclasses can override to add vendor-specific token manipulations or metadata.
Default implementation handles PostgreSQL-family routine body processing.
statement - the completed statement
protected AbstractSqlParser.PreparedSqlReader prepareSqlReader(ParserContext context) throws IOException
IOException
protected void initializeGlobalContext()
This method sets up the semantic analysis infrastructure required during the parsing phase. It creates:
When to call: At the beginning of performParsing(), before parsing statements.
Cleanup required: Must call globalFrame.popMeFromStack(frameStack)
after all statements are parsed to clean up the frame stack.
Extracted from: Identical implementations in OracleSqlParser and MssqlSqlParser to eliminate ~16 lines of duplicate code per parser.
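The push-then-always-pop discipline described above can be sketched with `try`/`finally`. This is one possible way to guarantee the cleanup, not a description of the actual implementation: `FrameStackSketch` and `Frame` are hypothetical, a `Deque` stands in for the `Stack` field, and `Runnable` stands in for per-statement parsing.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration of balanced frame-stack setup and cleanup.
class FrameStackSketch {
    static final class Frame {
        final String name;

        Frame(String name) {
            this.name = name;
        }
    }

    final Deque<Frame> frameStack = new ArrayDeque<>();

    // Pushes a global frame, runs every statement parser, and pops the
    // frame afterwards even if one of the parsers throws.
    int parseAll(List<Runnable> statementParsers) {
        Frame globalFrame = new Frame("global");
        frameStack.push(globalFrame);   // setup before any statement is parsed
        int parsed = 0;
        try {
            for (Runnable parser : statementParsers) {
                parser.run();
                parsed++;
            }
            return parsed;
        } finally {
            frameStack.pop();           // cleanup after all statements
        }
    }
}
```

Placing the pop in `finally` keeps the stack balanced on the failure path as well, so a later parse does not start with a stale global frame.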
protected void handleStatementParsingException(TCustomSqlStatement stmt, int statementIndex, Exception ex)
This method provides robust error handling that allows parsing to continue even when individual statements throw exceptions. It:
- TSyntaxError with exception information
- syntaxErrors list for user feedback
Benefits:
Example error message:
"Exception during parsing statement 3: NullPointerException - Cannot invoke..."
Extracted from: Identical implementations in OracleSqlParser and MssqlSqlParser to eliminate ~51 lines of duplicate code per parser.
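The continue-on-error behaviour can be sketched as a loop that catches per-statement exceptions and records them. The names here are hypothetical (`ContinueOnErrorSketch`, `parseOne`, a `List<String>` in place of `List<TSyntaxError>`), and the sketch prints the index exactly as passed in; whether the real message uses a 0-based or 1-based number is not stated above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: one failing statement does not stop the rest.
class ContinueOnErrorSketch {
    final List<String> syntaxErrors = new ArrayList<>();

    // Parses every statement and returns how many parsed without exception.
    int parseAll(List<String> statements) {
        int parsedOk = 0;
        for (int i = 0; i < statements.size(); i++) {
            try {
                parseOne(statements.get(i));
                parsedOk++;
            } catch (Exception ex) {
                handleStatementParsingException(i, ex);
            }
        }
        return parsedOk;
    }

    // Stand-in parser: rejects blank statements.
    void parseOne(String statement) {
        if (statement.trim().isEmpty()) {
            throw new IllegalArgumentException("empty statement");
        }
    }

    // Converts the exception into an error entry instead of rethrowing it.
    void handleStatementParsingException(int statementIndex, Exception ex) {
        syntaxErrors.add("Exception during parsing statement " + statementIndex + ": "
                + ex.getClass().getSimpleName() + " - " + ex.getMessage());
    }
}
```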
stmt - the statement that failed to parse
statementIndex - 0-based index of the statement in the statement list
ex - the exception that was thrown during parsing
protected void afterStatementParsed(TCustomSqlStatement stmt)
This method is called after each statement is successfully parsed but before error recovery and error collection. Subclasses can override this to perform vendor-specific operations such as:
Default implementation: Does nothing (no-op).
Example override (Oracle):
stmt - the statement that was just parsed
protected TSourceToken getanewsourcetoken()
This method wraps the lexer's yylexwrap() call and performs several important tasks:
Token Consolidation Rules:
Implementation Note: This method is extracted from TGSqlParser.getanewsourcetoken() and made available to all database-specific parsers to avoid code duplication.
protected String towinlinebreak(String s)
Currently returns the input unchanged. This method exists for compatibility with the original TGSqlParser implementation.
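For comparison, an actual conversion to Windows line breaks could look like the sketch below. This is not what the documented method does (it returns its input unchanged); `LineBreakSketch` and `toWindowsLineBreaks` are hypothetical names, and lone carriage returns are deliberately left untouched.

```java
// Hypothetical illustration of a real LF-to-CRLF conversion.
class LineBreakSketch {
    // Rewrites LF and CRLF line endings as CRLF; a lone CR is not changed.
    static String toWindowsLineBreaks(String s) {
        return s.replaceAll("\r?\n", "\r\n");
    }
}
```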
s - Input string
public char getDelimiterChar()
public String getDefaultDelimiterStr()