public class HiveSqlParser extends AbstractSqlParser
This parser handles Hive-specific SQL syntax including:
Design Notes:
- Extends AbstractSqlParser using the template method pattern
- Uses TLexerHive for tokenization
- Uses TParserHive for parsing

Usage Example:
// Get Hive parser from factory
SqlParser parser = SqlParserFactory.get(EDbVendor.dbvhive);
// Build context
ParserContext context = new ParserContext.Builder(EDbVendor.dbvhive)
    .sqlText("SELECT * FROM `default.employee` WHERE dept = 'IT'")
    .build();
// Parse
SqlParseResult result = parser.parse(context);
// Access statements
TStatementList statements = result.getSqlStatements();
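The template method pattern mentioned in the design notes can be pictured with a small, self-contained sketch. Every name below (`SketchParser`, `SketchHiveParser`, `tokenize`, `extractStatements`) is hypothetical and exists only for illustration. None of it is the library's actual `AbstractSqlParser` code. The base class fixes the order of the steps and the subclass supplies the vendor-specific parts.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an abstract parser base class (not AbstractSqlParser itself).
abstract class SketchParser {
    // Template method: the sequence of steps is fixed here and cannot be overridden.
    final List<String> parse(String sql) {
        List<String> tokens = tokenize(sql);
        return extractStatements(tokens);
    }

    // Vendor-specific hooks supplied by subclasses.
    protected abstract List<String> tokenize(String sql);

    protected abstract List<String> extractStatements(List<String> tokens);
}

// Hypothetical vendor subclass: whitespace tokens, semicolon-terminated statements.
class SketchHiveParser extends SketchParser {
    @Override
    protected List<String> tokenize(String sql) {
        List<String> tokens = new ArrayList<>();
        for (String t : sql.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    @Override
    protected List<String> extractStatements(List<String> tokens) {
        List<String> statements = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String t : tokens) {
            boolean ends = t.endsWith(";");
            String word = ends ? t.substring(0, t.length() - 1) : t;
            if (!word.isEmpty()) {
                if (current.length() > 0) {
                    current.append(' ');
                }
                current.append(word);
            }
            if (ends && current.length() > 0) {
                statements.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            statements.add(current.toString());
        }
        return statements;
    }
}

class TemplateMethodSketch {
    public static void main(String[] args) {
        SketchParser parser = new SketchHiveParser();
        System.out.println(parser.parse("SELECT 1; SELECT 2"));
    }
}
```

Because `parse` is `final` in the sketch, a subclass can change how each step works but not the order in which the steps run, which is the point of the pattern.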
See Also: SqlParser, AbstractSqlParser, TLexerHive, TParserHive

Nested classes/interfaces inherited from class AbstractSqlParser: AbstractSqlParser.PreparedSqlReader

| Modifier and Type | Field and Description |
|---|---|
| TLexerHive | flexer: The Hive lexer used for tokenization |
Fields inherited from class AbstractSqlParser: defaultDelimiterStr, delimiterChar, frameStack, globalContext, globalFrame, lexer, sourcetokenlist, sqlcmds, sqlEnv, sqlstatements, syntaxErrors, vendor

| Constructor and Description |
|---|
| HiveSqlParser(): Construct Hive SQL parser. |
| Modifier and Type | Method and Description |
|---|---|
| protected void | extractVendorRawStatements(SqlParseResult.Builder builder): Call Hive-specific raw statement extraction logic. |
| protected TCustomLexer | getLexer(ParserContext context): Return the Hive lexer instance. |
| protected TCustomParser | getParser(ParserContext context, TSourceTokenList tokens): Return the Hive SQL parser instance with updated token list. |
| protected TCustomParser | getSecondaryParser(ParserContext context, TSourceTokenList tokens): Hive does not use a secondary parser (unlike Oracle with PL/SQL). |
| protected void | performInterpreter(ParserContext context, TStatementList statements): Perform interpretation/evaluation on statements. |
| protected TStatementList | performParsing(ParserContext context, TCustomParser parser, TCustomParser secondaryParser, TSourceTokenList tokens, TStatementList rawStatements): Parse all raw SQL statements. |
| protected void | performSemanticAnalysis(ParserContext context, TStatementList statements): Perform semantic analysis on parsed statements. |
| protected void | setupVendorParsersForExtraction(): Setup Hive parser for raw statement extraction. |
| protected void | tokenizeVendorSql(): Call Hive-specific tokenization logic. |
| String | toString() |
Methods inherited from class AbstractSqlParser: afterStatementParsed, attemptErrorRecovery, copyErrorsFromStatement, doAfterTokenize, doExtractRawStatements, extractRawStatements, getanewsourcetoken, getDefaultDelimiterStr, getDelimiterChar, getErrorCount, getrawsqlstatements, getSyntaxErrors, getVendor, handleStatementParsingException, initializeGlobalContext, isDollarFunctionDelimiter, onRawStatementComplete, onRawStatementCompleteVendorSpecific, parse, performTokenization, prepareSqlReader, processTokensBeforeParse, processTokensInTokenTable, setTokenHandle, tokenize, towinlinebreak

public TLexerHive flexer
public HiveSqlParser()
Configures the parser for the Hive database with the default delimiter (;).
Following the original TGSqlParser pattern, the lexer and parser are created once in the constructor and reused for all parsing operations.
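The create-once, reuse-many arrangement described for the constructor can be sketched as follows. The classes `SketchLexer` and `ReusingParser` are hypothetical illustrations, not the library's `TLexerHive` or `HiveSqlParser`. The sketch only shows the general shape of holding one lexer in a final field and resetting its input on each call.

```java
// Hypothetical lexer with per-call mutable state (not the library's TLexerHive).
class SketchLexer {
    private String input = "";

    void setInput(String sql) {
        this.input = sql;
    }

    // Counts whitespace-separated tokens in the current input.
    int countTokens() {
        String trimmed = input.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }
}

// Hypothetical parser that builds its lexer once and reuses it.
class ReusingParser {
    private final SketchLexer lexer;

    ReusingParser() {
        // Created once here, then shared by every parse call on this instance.
        this.lexer = new SketchLexer();
    }

    SketchLexer lexer() {
        return lexer;
    }

    int parse(String sql) {
        lexer.setInput(sql); // reset the state instead of allocating a new lexer
        return lexer.countTokens();
    }
}

class ReuseSketch {
    public static void main(String[] args) {
        ReusingParser parser = new ReusingParser();
        System.out.println(parser.parse("SELECT 1"));
        System.out.println(parser.parse("SELECT a FROM t"));
    }
}
```

In a design like this sketch, the shared lexer carries state between calls, so one instance would usually not be shared across threads without synchronization. Whether that applies to HiveSqlParser is not stated on this page.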
protected TCustomLexer getLexer(ParserContext context)
getLexer in class AbstractSqlParser
context - the parser context

protected TCustomParser getParser(ParserContext context, TSourceTokenList tokens)
getParser in class AbstractSqlParser
context - the parser context
tokens - the source token list

protected TCustomParser getSecondaryParser(ParserContext context, TSourceTokenList tokens)
getSecondaryParser in class AbstractSqlParser
context - the parser context
tokens - the source token list

protected void tokenizeVendorSql()
Delegates to dohivetexttotokenlist, which handles Hive-specific keyword recognition, backtick-quoted identifiers, and qualified name splitting.
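As a rough illustration of what splitting a backtick-quoted qualified name can look like, the sketch below strips one pair of surrounding backticks and splits on dots. The helper `splitQualifiedName` is hypothetical, and the assumption that a quoted token such as `default.employee` should be split into two parts is taken from the wording above, not from the dohivetexttotokenlist source.

```java
import java.util.ArrayList;
import java.util.List;

class QualifiedNameSketch {
    // Hypothetical helper: strips one pair of surrounding backticks, then splits on dots.
    static List<String> splitQualifiedName(String token) {
        String body = token;
        if (body.length() >= 2 && body.startsWith("`") && body.endsWith("`")) {
            body = body.substring(1, body.length() - 1);
        }
        List<String> parts = new ArrayList<>();
        // Limit -1 keeps empty trailing parts so malformed names stay visible.
        for (String part : body.split("\\.", -1)) {
            parts.add(part);
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(splitQualifiedName("`default.employee`"));
        System.out.println(splitQualifiedName("employee"));
    }
}
```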
tokenizeVendorSql in class AbstractSqlParser

protected void setupVendorParsersForExtraction()
Hive uses a single parser, so we inject sqlcmds and update the token list for the main parser only.
setupVendorParsersForExtraction in class AbstractSqlParser

protected void extractVendorRawStatements(SqlParseResult.Builder builder)
Delegates to dohivegetrawsqlstatements, which handles Hive's statement delimiters (semicolons).
Note: parserContext is already set by AbstractSqlParser before this method is called.
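A minimal sketch of semicolon-based statement extraction is shown below, assuming that semicolons inside quoted strings or backtick-quoted identifiers should not end a statement. The class and method names are hypothetical, the logic works on characters instead of tokens, and it is not the dohivegetrawsqlstatements implementation.

```java
import java.util.ArrayList;
import java.util.List;

class RawStatementSketch {
    // Splits on semicolons that appear outside '...', "..." and `...` spans.
    // Backslash escapes are not handled in this sketch.
    static List<String> splitStatements(String sql) {
        List<String> statements = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char quote = 0; // 0 means "not inside a quoted span"
        for (int i = 0; i < sql.length(); i++) {
            char c = sql.charAt(i);
            if (quote != 0) {
                current.append(c);
                if (c == quote) {
                    quote = 0; // closing quote
                }
            } else if (c == '\'' || c == '"' || c == '`') {
                quote = c; // opening quote
                current.append(c);
            } else if (c == ';') {
                flush(current, statements);
            } else {
                current.append(c);
            }
        }
        flush(current, statements); // trailing statement without a delimiter
        return statements;
    }

    private static void flush(StringBuilder current, List<String> statements) {
        String text = current.toString().trim();
        if (!text.isEmpty()) {
            statements.add(text);
        }
        current.setLength(0);
    }

    public static void main(String[] args) {
        System.out.println(splitStatements("SELECT ';' AS s; SELECT 2;"));
    }
}
```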
extractVendorRawStatements in class AbstractSqlParser
builder - the result builder to populate with raw statements

protected TStatementList performParsing(ParserContext context, TCustomParser parser, TCustomParser secondaryParser, TSourceTokenList tokens, TStatementList rawStatements)
This method performs full syntax analysis of each statement:
Migrated from TGSqlParser.performParsing()
performParsing in class AbstractSqlParser
context - the parser context
parser - the main parser (TParserHive)
secondaryParser - the secondary parser (null for Hive)
tokens - the source token list
rawStatements - raw statements already extracted (never null)

protected void performSemanticAnalysis(ParserContext context, TStatementList statements)
Runs TSQLResolver to build relationships between tables and columns, resolve references, and perform type checking.
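To make "resolve references" concrete, the sketch below resolves an unqualified column name against a set of known tables and rejects ambiguous matches. The map-based schema and the `resolve` method are assumptions made for illustration. They do not reflect the TSQLResolver API, which is not documented on this page.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ColumnResolverSketch {
    // Returns the single table that declares the column, or null if none does.
    // Throws if more than one table declares it (an ambiguous reference).
    static String resolve(Map<String, List<String>> tables, String column) {
        String owner = null;
        for (Map.Entry<String, List<String>> entry : tables.entrySet()) {
            if (entry.getValue().contains(column)) {
                if (owner != null) {
                    throw new IllegalStateException("ambiguous column: " + column);
                }
                owner = entry.getKey();
            }
        }
        return owner;
    }

    public static void main(String[] args) {
        // Hypothetical schema, used only for this example.
        Map<String, List<String>> tables = new LinkedHashMap<>();
        tables.put("employee", Arrays.asList("id", "name", "dept"));
        tables.put("department", Arrays.asList("id", "title"));
        System.out.println(resolve(tables, "dept"));
    }
}
```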
performSemanticAnalysis in class AbstractSqlParser
context - the parser context
statements - the parsed statements (mutable)

protected void performInterpreter(ParserContext context, TStatementList statements)
Runs TASTEvaluator for compile-time constant expression evaluation. Hive does not require interpretation currently.
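Compile-time constant expression evaluation can be pictured as a small recursive fold over an expression tree. The types below (`Expr`, `Num`, `Col`, `Add`) are hypothetical and far simpler than a real SQL AST. They are not the TASTEvaluator API, only a sketch of the general technique.

```java
// Hypothetical expression node: constantValue() returns null when the
// expression cannot be reduced to a constant at compile time.
interface Expr {
    Integer constantValue();
}

// Integer literal: always constant.
class Num implements Expr {
    private final int value;

    Num(int value) {
        this.value = value;
    }

    public Integer constantValue() {
        return value;
    }
}

// Column reference: its value is only known at run time.
class Col implements Expr {
    private final String name;

    Col(String name) {
        this.name = name;
    }

    public Integer constantValue() {
        return null;
    }
}

// Addition: constant only if both operands are constant.
class Add implements Expr {
    private final Expr left;
    private final Expr right;

    Add(Expr left, Expr right) {
        this.left = left;
        this.right = right;
    }

    public Integer constantValue() {
        Integer l = left.constantValue();
        Integer r = right.constantValue();
        return (l == null || r == null) ? null : Integer.valueOf(l + r);
    }
}

class ConstantFoldSketch {
    public static void main(String[] args) {
        Expr constant = new Add(new Num(1), new Add(new Num(2), new Num(3)));
        Expr dynamic = new Add(new Num(1), new Col("dept"));
        System.out.println(constant.constantValue());
        System.out.println(dynamic.constantValue());
    }
}
```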
performInterpreter in class AbstractSqlParser
context - the parser context
statements - the parsed statements (mutable)

public String toString()
toString in class AbstractSqlParser