Class SparkSqlParser

Object
gudusoft.gsqlparser.parser.AbstractSqlParser
gudusoft.gsqlparser.parser.SparkSqlParser
All Implemented Interfaces:
SqlParser

public class SparkSqlParser extends AbstractSqlParser
Apache Spark SQL parser implementation.

This parser handles SparkSQL-specific SQL syntax including:

  • SparkSQL DML/DDL operations
  • Special token handling for DATE, TIME, TIMESTAMP, INTERVAL
  • MySQL-style SOURCE commands
  • Stored procedures, functions, and triggers

Implementation Status: MIGRATED

  • Completed: Full migration from TGSqlParser to AbstractSqlParser
  • Tokenization: dosparksqltexttotokenlist()
  • Raw Extraction: dosparksqlgetrawsqlstatements()
  • Parsing: Fully self-contained using TParserSparksql
Since:
3.2.0.0
  • Constructor Details

    • SparkSqlParser

      public SparkSqlParser()
      Constructs a SparkSQL parser.

      Configures the parser for SparkSQL with the default delimiter (;).

      Following the original TGSqlParser pattern (lines 1285-1293), the lexer and parser are created once in the constructor and reused for all parsing operations to avoid unnecessary object allocation overhead.
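
      The create-once, reuse-many pattern described here can be sketched as follows. This is only an illustration: ToyLexer, ToyParser, and ReusingSqlParser are hypothetical stand-ins, not library types, and the toy tokenizer simply splits on whitespace and the delimiter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for a lexer and a parser; not library types.
class ToyLexer {
    char delimiterChar = ';';

    // Splits on whitespace and treats the delimiter as its own token.
    List<String> tokenize(String sql) {
        String d = String.valueOf(delimiterChar);
        List<String> out = new ArrayList<>();
        for (String t : sql.replace(d, " " + d + " ").trim().split("\\s+")) {
            if (!t.isEmpty()) out.add(t);
        }
        return out;
    }
}

class ToyParser {
    List<String> tokens = new ArrayList<>();
}

class ReusingSqlParser {
    // Both collaborators are allocated once, in the constructor.
    private final ToyLexer lexer;
    private final ToyParser parser;

    ReusingSqlParser() {
        lexer = new ToyLexer();
        parser = new ToyParser();
    }

    ToyLexer getLexer() {
        return lexer;
    }

    // Reuses the cached parser but points it at the new token list.
    ToyParser getParser(List<String> tokens) {
        parser.tokens = tokens;
        return parser;
    }
}
```

      In this sketch every call to getLexer() or getParser(...) returns the same instance, so repeated parse runs allocate no new lexer or parser objects; only the token list changes.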

  • Method Details

    • getVendor

      public EDbVendor getVendor()
      Description copied from interface: SqlParser
      Get the database vendor this parser handles.
      Specified by:
      getVendor in interface SqlParser
      Overrides:
      getVendor in class AbstractSqlParser
      Returns:
      the database vendor (e.g., dbvoracle, dbvmysql)
    • getLexer

      protected TCustomLexer getLexer(ParserContext context)
      Description copied from class: AbstractSqlParser
      Get the lexer for this vendor.

      Subclass Responsibility: Return vendor-specific lexer instance. The lexer may be created fresh or cached/reused for performance.

      Example:

       protected TCustomLexer getLexer(ParserContext context) {
           TLexerOracle lexer = new TLexerOracle();
           lexer.delimiterchar = delimiterChar;
           lexer.defaultDelimiterStr = defaultDelimiterStr;
           return lexer;
       }
       
      Specified by:
      getLexer in class AbstractSqlParser
      Parameters:
      context - the parser context
      Returns:
      configured lexer instance (never null)
    • getParser

      protected TCustomParser getParser(ParserContext context, TSourceTokenList tokens)
      Description copied from class: AbstractSqlParser
      Get the main parser for this vendor.

      Subclass Responsibility: Return vendor-specific parser instance. The parser may be created fresh or cached/reused for performance. If reusing, the token list should be updated.

      Example:

       protected TCustomParser getParser(ParserContext context, TSourceTokenList tokens) {
           TParserOracleSql parser = new TParserOracleSql(tokens);
           parser.lexer = getLexer(context);
           return parser;
       }
       
      Specified by:
      getParser in class AbstractSqlParser
      Parameters:
      context - the parser context
      tokens - the source token list
      Returns:
      configured parser instance (never null)
    • getSecondaryParser

      Description copied from class: AbstractSqlParser
      Get secondary parser (e.g., PL/SQL for Oracle).

      Hook Method: Default implementation returns null. Override if vendor needs a secondary parser. The parser may be created fresh or cached/reused for performance.

      Example (Oracle):

       protected TCustomParser getSecondaryParser(ParserContext context, TSourceTokenList tokens) {
           TParserOraclePLSql plsqlParser = new TParserOraclePLSql(tokens);
           plsqlParser.lexer = getLexer(context);
           return plsqlParser;
       }
       
      Overrides:
      getSecondaryParser in class AbstractSqlParser
      Parameters:
      context - the parser context
      tokens - the source token list
      Returns:
      secondary parser instance, or null if not needed
    • tokenizeVendorSql

      protected void tokenizeVendorSql()
      Hook method for vendor-specific tokenization.

      Delegates to dosparksqltexttotokenlist() which implements SparkSQL-specific token processing logic.

      Specified by:
      tokenizeVendorSql in class AbstractSqlParser
    • setupVendorParsersForExtraction

      Hook method to set up parsers before raw statement extraction.

      Injects sqlcmds and sourcetokenlist into the SparkSQL parser.

      Specified by:
      setupVendorParsersForExtraction in class AbstractSqlParser
    • extractVendorRawStatements

      Hook method for vendor-specific raw statement extraction.

      Delegates to dosparksqlgetrawsqlstatements() which implements SparkSQL-specific statement boundary detection.

      Specified by:
      extractVendorRawStatements in class AbstractSqlParser
      Parameters:
      builder - the result builder to populate with raw statements
    • performParsing

      protected TStatementList performParsing(ParserContext context, TCustomParser mainParser, TCustomParser secondaryParser, TSourceTokenList tokens, TStatementList rawStatements)
      Description copied from class: AbstractSqlParser
      Perform actual parsing with syntax checking.

      Subclass Responsibility: Parse SQL using vendor-specific parser and optional secondary parser (e.g., PL/SQL for Oracle).

      Important: This method receives raw statements that have already been extracted by AbstractSqlParser.getrawsqlstatements(ParserContext). Subclasses should NOT re-extract statements; they should only parse each statement to build the AST.

      Example:

       protected TStatementList performParsing(ParserContext context,
                                               TCustomParser parser,
                                               TCustomParser secondaryParser,
                                               TSourceTokenList tokens,
                                               TStatementList rawStatements) {
           // Use the passed-in rawStatements (DO NOT re-extract!)
           for (int i = 0; i < rawStatements.size(); i++) {
               TCustomSqlStatement stmt = rawStatements.get(i);
               stmt.parsestatement(...);  // Build AST for each statement
           }
           return rawStatements;
       }
       
      Specified by:
      performParsing in class AbstractSqlParser
      Parameters:
      context - the parser context
      mainParser - the main parser instance
      secondaryParser - secondary parser (may be null)
      tokens - the source token list
      rawStatements - raw statements already extracted (never null)
      Returns:
      statement list with parsed AST (never null)
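
      The hooks documented above (tokenizeVendorSql, setupVendorParsersForExtraction, extractVendorRawStatements, performParsing) suggest a template-method pipeline. The sketch below assumes the base class calls them in that order, which this page does not state outright. ToyAbstractParser and ToySparkParser are hypothetical, and their simplified signatures (plain strings, a returned list instead of a result builder) are not the library's.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class that only borrows the hook names from this page.
abstract class ToyAbstractParser {
    protected String sql = "";
    protected final List<String> tokens = new ArrayList<>();
    protected final List<String> trace = new ArrayList<>();

    // Template method: the order is fixed here, the steps are vendor-specific.
    final List<String> parse(String text) {
        sql = text;
        tokens.clear();
        trace.clear();
        tokenizeVendorSql();
        setupVendorParsersForExtraction();
        List<String> raw = extractVendorRawStatements();
        return performParsing(raw);
    }

    protected abstract void tokenizeVendorSql();
    protected abstract void setupVendorParsersForExtraction();
    protected abstract List<String> extractVendorRawStatements();
    protected abstract List<String> performParsing(List<String> rawStatements);
}

class ToySparkParser extends ToyAbstractParser {
    @Override
    protected void tokenizeVendorSql() {
        trace.add("tokenize");
        for (String t : sql.replace(";", " ; ").trim().split("\\s+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
    }

    @Override
    protected void setupVendorParsersForExtraction() {
        trace.add("setup"); // a real parser would receive the token list here
    }

    @Override
    protected List<String> extractVendorRawStatements() {
        trace.add("extract");
        List<String> raw = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String t : tokens) {
            if (t.equals(";")) { // the delimiter closes the current statement
                if (!current.isEmpty()) raw.add(String.join(" ", current));
                current.clear();
            } else {
                current.add(t);
            }
        }
        if (!current.isEmpty()) raw.add(String.join(" ", current));
        return raw;
    }

    @Override
    protected List<String> performParsing(List<String> rawStatements) {
        trace.add("parse");
        List<String> parsed = new ArrayList<>();
        // Uses the statements passed in; nothing is re-extracted.
        for (String stmt : rawStatements) parsed.add("parsed:" + stmt);
        return parsed;
    }
}
```

      Calling new ToySparkParser().parse("select 1; select 2") walks the four hooks once each and hands performParsing the two statements that extraction produced, which mirrors the rule that parsing must not re-extract.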
    • afterStatementParsed

      Hook for vendor-specific post-processing after statement is parsed.

      The default implementation is a no-op for SparkSQL.

      Overrides:
      afterStatementParsed in class AbstractSqlParser
      Parameters:
      stmt - the statement that was just parsed
    • handleCreateTableErrorRecovery

      Handle error recovery for CREATE TABLE statements.

      Migrated from TGSqlParser.handleCreateTableErrorRecovery() (lines 16916-16971).

      SparkSQL allows table properties that may not be fully parsed. This method marks unparseable properties as SQL*Plus commands so that the parser skips them.
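
      One way such a recovery step could work is sketched below. It assumes that "table properties" means every token after the parenthesis that closes the column list, and it uses a boolean flag where the real parser would retag the token as a SQL*Plus command. ToyToken and CreateTableRecovery are hypothetical names; the actual logic in the library may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical token type; "skipped" stands in for the SQL*Plus-command tag.
class ToyToken {
    final String text;
    boolean skipped;

    ToyToken(String text) {
        this.text = text;
    }
}

class CreateTableRecovery {
    // Marks every token after the column list, except ';', and returns the count.
    static int markTrailingProperties(List<ToyToken> tokens) {
        int depth = 0;
        boolean pastColumnList = false;
        int marked = 0;
        for (ToyToken t : tokens) {
            if (pastColumnList) {
                if (!t.text.equals(";")) {
                    t.skipped = true;
                    marked++;
                }
                continue;
            }
            if (t.text.equals("(")) {
                depth++;
            } else if (t.text.equals(")")) {
                depth--;
                if (depth == 0) pastColumnList = true; // column list just closed
            }
        }
        return marked;
    }

    // Convenience factory for building a token list from plain strings.
    static List<ToyToken> tokens(String... texts) {
        List<ToyToken> out = new ArrayList<>();
        for (String s : texts) out.add(new ToyToken(s));
        return out;
    }
}
```

      After marking, a second parse attempt would see only the CREATE TABLE header and the column list, so an unsupported property clause no longer causes a syntax error.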

    • performSemanticAnalysis

      protected void performSemanticAnalysis(ParserContext context, TStatementList statements)
      Description copied from class: AbstractSqlParser
      Perform semantic analysis on parsed statements.

      Hook Method: Default implementation does nothing. Override to provide vendor-specific semantic analysis.

      Typical Implementation:

      • Column-to-table resolution (TSQLResolver)
      • Dataflow analysis
      • Reference resolution
      • Scope resolution
      Overrides:
      performSemanticAnalysis in class AbstractSqlParser
      Parameters:
      context - the parser context
      statements - the parsed statements (mutable)
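
      The hook-method contract (do nothing by default, override to add analysis) can be illustrated with a toy column-to-table resolver. ToyStatement, ToyHookBase, and ResolvingParser are hypothetical; the sketch does not use TSQLResolver or any other library class, and it mutates the statements in place because the parameter is documented as mutable.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical statement: one table and the bare column names it references.
class ToyStatement {
    final String table;
    final List<String> columns;
    final Map<String, String> resolved = new LinkedHashMap<>();

    ToyStatement(String table, String... columns) {
        this.table = table;
        this.columns = Arrays.asList(columns);
    }
}

class ToyHookBase {
    final void run(List<ToyStatement> statements) {
        performSemanticAnalysis(statements);
    }

    // Hook: intentionally empty so subclasses can opt in.
    protected void performSemanticAnalysis(List<ToyStatement> statements) {
    }
}

class ResolvingParser extends ToyHookBase {
    @Override
    protected void performSemanticAnalysis(List<ToyStatement> statements) {
        for (ToyStatement s : statements) {
            for (String c : s.columns) {
                s.resolved.put(c, s.table + "." + c); // qualify each column
            }
        }
    }
}
```

      Running the base class leaves every statement untouched, while ResolvingParser fills in the resolved map, for example mapping column a of table t to t.a.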
    • performInterpreter

      protected void performInterpreter(ParserContext context, TStatementList statements)
      Description copied from class: AbstractSqlParser
      Perform interpretation/evaluation on parsed statements.

      Hook Method: Default implementation does nothing. Override to provide AST interpretation/evaluation.

      Typical Implementation:

      • Execute simple SQL statements
      • Evaluate expressions
      • Constant folding
      • Static analysis
      Overrides:
      performInterpreter in class AbstractSqlParser
      Parameters:
      context - the parser context
      statements - the parsed statements (mutable)
    • toString

      public String toString()
      Overrides:
      toString in class AbstractSqlParser