public class TGSqlParser extends Object
Create a parser by specifying a database vendor, then set the SQL script text via the setSqltext(java.lang.String) method,
or read the input SQL from a file via the setSqlfilename(java.lang.String) method.
After that, call one of the following methods to achieve what you need:
tokenizeSqltext() turns the input SQL into a sequence of tokens, the basic lexical elements of SQL syntax.
Tokens are categorized as keyword, identifier, number, operator, whitespace and other types.
All source tokens can be fetched via the getSourcetokenlist() method.
getrawsqlstatements() separates the SQL statements in the input SQL script without checking syntax.
Use the getSqlstatements() method to get the list of SQL statements, each of which is an instance of a sub-class of TCustomSqlStatement.
Get the SQL statement type via the TCustomSqlStatement.sqlstatementtype field, and the string representation of
each SQL statement via the TParseTreeNode.toString() method.
All source tokens in a SQL statement are available via the TCustomSqlStatement.sourcetokenlist field.
Since no parse tree is built by calling this method, no further detailed information about the SQL statement is available.
parse() checks the syntax of the input SQL and performs some semantic analysis without connecting to
a real database.
This method performs an in-depth analysis of the input SQL, such as building the links between tables and columns.
The parse tree of the input SQL is available after calling this method.
If syntax errors are found, the error message can be fetched via the getErrormessage() method.
A syntax error in one SQL statement doesn't prevent the parser from checking the syntax of the next SQL statement.
After checking the syntax of all SQL statements, use the getErrorCount() method to get the total number of errors.
A syntax error in a SQL stored procedure stops the parser from checking the syntax of the remaining SQL statements
in that stored procedure.
A SQL script can be formatted after calling parse():
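// sqlparser is a TGSqlParser whose input SQL has already been set
int ret = sqlparser.parse();
if (ret == 0) {
    // no syntax error: pretty-print the parsed script
    GFmtOpt option = GFmtOptFactory.newInstance();
    String result = FormatterFactory.pp(sqlparser, option);
    System.out.println(result);
} else {
    // print the syntax errors collected during parsing
    System.out.println(sqlparser.getErrormessage());
}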
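To make the token categories listed for tokenizeSqltext() more concrete, here is a minimal, self-contained sketch of a lexer that sorts SQL text into keyword, identifier, number, operator and whitespace tokens. It is not the lexer used by TGSqlParser: the class name MiniSqlLexer, the Kind enum, the Token class and the tiny keyword set are all assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for illustration only; not part of gudusoft.gsqlparser.
final class MiniSqlLexer {
    enum Kind { KEYWORD, IDENTIFIER, NUMBER, OPERATOR, WHITESPACE, OTHER }

    // A token is a slice of the input text plus its category.
    static final class Token {
        final Kind kind;
        final String text;

        Token(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    // Assumed, deliberately tiny keyword list; a real SQL dialect has far more.
    private static final Set<String> KEYWORDS =
            Set.of("SELECT", "FROM", "WHERE", "AND", "OR", "INSERT", "UPDATE", "DELETE");

    static List<Token> tokenize(String sql) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < sql.length()) {
            char c = sql.charAt(i);
            int start = i;
            Kind kind;
            if (Character.isWhitespace(c)) {
                // a run of whitespace becomes one token
                while (i < sql.length() && Character.isWhitespace(sql.charAt(i))) i++;
                kind = Kind.WHITESPACE;
            } else if (Character.isLetter(c) || c == '_') {
                // a word is a keyword if it is in the keyword set, else an identifier
                while (i < sql.length()
                        && (Character.isLetterOrDigit(sql.charAt(i)) || sql.charAt(i) == '_')) i++;
                String word = sql.substring(start, i).toUpperCase(Locale.ROOT);
                kind = KEYWORDS.contains(word) ? Kind.KEYWORD : Kind.IDENTIFIER;
            } else if (Character.isDigit(c)) {
                while (i < sql.length() && Character.isDigit(sql.charAt(i))) i++;
                kind = Kind.NUMBER;
            } else if ("=<>+-*/".indexOf(c) >= 0) {
                // each operator character is emitted as its own token
                i++;
                kind = Kind.OPERATOR;
            } else {
                i++;
                kind = Kind.OTHER;
            }
            tokens.add(new Token(kind, sql.substring(start, i)));
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (Token t : tokenize("SELECT a FROM t WHERE a >= 10")) {
            System.out.println(t.kind + " '" + t.text + "'");
        }
    }
}
```

Whitespace is kept as a token in this sketch, mirroring the statement above that whitespace is one of the token categories.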
After parsing the SQL script, all parse tree nodes are available for use.
Typically, the SQL parse tree nodes generated by this SQL parser are closely related to the SQL elements defined in the database vendor's SQL reference. Here is a brief summary of some of the most used SQL elements and the corresponding classes defined in this SQL parser.
TObjectName
TConstant
TTypeName
TFunctionCall
TConstraint
TExpression
TResultColumn
gudusoft.gsqlparser.nodes
Some major SQL statements:
TSelectSqlStatement
TDeleteSqlStatement
TInsertSqlStatement
TUpdateSqlStatement
TCreateTableSqlStatement
gudusoft.gsqlparser.stmt
Stored procedure
TDb2CreateFunction, TMssqlCreateProcedure, TMySQLCreateFunction, TPlsqlCreateFunction
TDb2CreateProcedure, TMssqlCreateProcedure, TMySQLCreateProcedure, TPlsqlCreateProcedure
TCreateTriggerStmt, TPlsqlCreateTrigger
TPlsqlCreatePackage
Modifier and Type | Field and Description |
---|---|
static EDbVendor | currentDBVendor |
TSourceTokenList | sourcetokenlist: Tokens generated by the lexer from the input SQL script. |
String | sqlfilename: The input SQL will be read from this file. If this field is specified, then sqltext will be ignored. |
TStatementList | sqlstatements: SQL statements generated by this parser from the input SQL script. |
String | sqltext: The input SQL text. |
Constructor and Description |
---|
TGSqlParser(EDbVendor pdbvendor): Class constructor; creates a new instance of the parser for the given database vendor. |
Modifier and Type | Method and Description |
---|---|
int | checkSyntax(): Checks the syntax of the input SQL. |
void | freeParseTable(): Not used. |
EDbVendor | getDbVendor(): The database vendor specified when creating this parser. |
static EDbVendor | getDBVendorByName(String dbVendorName): Converts the string name of a database to an EDbVendor value. access: EDbVendor.dbvaccess; ansi: EDbVendor.dbvansi; bigquery: EDbVendor.dbvbigquery; couchbase: EDbVendor.dbvcouchbase; dax: EDbVendor.dbvdax; db2: EDbVendor.dbvdb2; firebird: EDbVendor.dbvfirebird; generic: EDbVendor.dbvgeneric; greenplum: EDbVendor.dbvgreenplum; hana: EDbVendor.dbvhana; hive: EDbVendor.dbvhive; impala: EDbVendor.dbvimpala; informix: EDbVendor.dbvinformix; mdx: EDbVendor.dbvmdx; mssql or sqlserver: EDbVendor.dbvmssql; mysql: EDbVendor.dbvmysql; netezza: EDbVendor.dbvnetezza; odbc: EDbVendor.dbvodbc; openedge: EDbVendor.dbvopenedge; oracle: EDbVendor.dbvoracle; postgresql or postgres: EDbVendor.dbvpostgresql; redshift: EDbVendor.dbvredshift; snowflake: EDbVendor.dbvsnowflake; sybase: EDbVendor.dbvsybase; teradata: EDbVendor.dbvteradata; vertica: EDbVendor.dbvvertica |
int | getErrorCount(): The total number of syntax errors found in the input SQL script. |
String | getErrormessage(): The text of the error message, generated by iterating over all items in getSyntaxErrors(). |
TCustomLexer | getFlexer(): The lexer used to tokenize the input SQL. |
Stack<gudusoft.gsqlparser.compiler.TFrame> | getFrameStack() |
long | getInterpreterTime(): Gets the accumulated time spent in the interpreter, in milliseconds. |
static String | getLicenseMessage(): Not used. |
static String | getLicenseType(): Not used. |
static String | getMachineId(): Not used. |
IMetaDatabase | getMetaDatabase(): Deprecated. As of v2.0.3.1, please use getSqlEnv() instead. Returns a new instance of the class which implements the IMetaDatabase interface. |
long | getParsingTime(): Gets the accumulated time spent parsing, in milliseconds. |
int | getrawsqlstatements(): Separates the SQL statements in the input SQL script without checking syntax. |
long | getRawSqlStatementsTime(): Gets the accumulated time spent getting raw SQL statements, in milliseconds. |
long | getSemanticAnalysisTime(): Gets the accumulated time spent on semantic analysis, in milliseconds. |
TSourceTokenList | getSourcetokenlist(): A sequence of source tokens created by the lexer after tokenizing the input SQL. |
String | getSqlCharset() |
TSQLEnv | getSqlEnv(): The SQL environment, which includes database metadata such as procedures, functions, triggers and tables. |
String | getSqlfilename(): The input SQL filename. |
InputStream | getSqlInputStream(): The InputStream from which SQL will be read. |
TStatementList | getSqlstatements(): A list of SQL statements created by the parser. |
String | getSqltext(): The SQL text being processed. |
ArrayList<TSyntaxError> | getSyntaxErrors(): The list of syntax errors generated by the parser while checking the syntax of the input SQL; each element of this list is of type TSyntaxError. |
String | getTimeStatistics(): Returns time statistics as a formatted string. |
long | getTotalTime(): Gets the total accumulated time spent in all steps. |
static String | getUserName(): Not used. |
boolean | isTimeLoggingEnabled(): Checks whether time logging is enabled. |
int | parse(): Checks the syntax of the input SQL and performs some semantic analysis without connecting to a real database. |
static TConstant | parseConstant(EDbVendor dbVendor, String newConstant) |
TConstant | parseConstant(String newConstant): Creates a constant object from the parameter newConstant. |
static TExpression | parseExpression(EDbVendor dbVendor, String expr) |
TExpression | parseExpression(String expr): Creates an expression object from the parameter expr. |
static TFunctionCall | parseFunctionCall(EDbVendor dbVendor, String newFunction) |
TFunctionCall | parseFunctionCall(String newFunction): Creates a function call object from the parameter newFunction. |
static TObjectName | parseObjectName(EDbVendor dbVendor, String newObjectName) |
TObjectName | parseObjectName(String newObjectName): Creates a database object name from the parameter newObjectName. |
static TSelectSqlStatement | parseSubquery(EDbVendor dbVendor, String subquery) |
TSelectSqlStatement | parseSubquery(String subquery): Creates a select statement object from the parameter subquery. |
void | resetTimeCounters(): Resets all accumulated time counters to zero. |
void | setEnablePartialParsing(boolean enablePartialParsing): Deprecated. As of v1.4.3.4. Enables GSP to parse the rest of the SQL statements inside a stored procedure when a SQL statement in the stored procedure cannot be parsed. Currently available only for Sybase stored procedures. |
void | setEnableTimeLogging(boolean enable): Enables or disables time logging for parser steps. |
void | setFrameStack(Stack<gudusoft.gsqlparser.compiler.TFrame> frameStack) |
void | setMetaDatabase(IMetaDatabase metaDatabase): Deprecated. As of v2.0.3.1, please use getSqlEnv() instead. Sets an instance of a class which implements the IMetaDatabase interface. The parser will call the IMetaDatabase.checkColumn(java.lang.String, java.lang.String, java.lang.String, java.lang.String, java.lang.String) method when it needs to know whether a column belongs to a table. The class that implements IMetaDatabase usually fetches the metadata from the database by connecting to a database instance. If the class is not provided, the parser has to guess the relationship between an unqualified column and a table in the input SQL, which may lead to an undetermined result between the column and table. |
void | setOnlyNeedRawParseTree(boolean onlyNeedRawParseTree) |
void | setSinglePLBlock(boolean singlePLBlock) |
void | setSqlCharset(String sqlCharset) |
void | setSqlEnv(TSQLEnv sqlEnv) |
void | setSqlfilename(String sqlfilename): Sets the filename from which the input SQL will be read. |
void | setSqlInputStream(InputStream sqlInputStream): Sets the InputStream from which SQL will be read. |
void | setSqlStatementHandle(ISQLStatementHandle sqlStatementHandle) |
void | setSqltext(String sqltext): Sets the input SQL text. If sqlfilename is specified before this method, the parser will use the SQL text in this field instead of reading SQL from sqlfilename. |
void | setTeradataUtilityType(TeradataUtilityType teradataUtilityType) |
void | setTokenHandle(ITokenHandle tokenHandle): Sets an event handler which is fired when a new source token is created by the lexer while tokenizing the input SQL. |
void | setTokenListHandle(ITokenListHandle tokenListHandle) |
void | tokenizeSqltext(): Turns the input SQL into a sequence of tokens, the basic lexical elements of SQL syntax. |
int | validate() |
public static EDbVendor currentDBVendor
public String sqltext
The input SQL text. If sqlfilename is specified, then this field will be ignored.
public String sqlfilename
The input SQL will be read from this file. If this field is specified, then sqltext will be ignored.
This must be the full path to the file; a relative path doesn't work.
public TSourceTokenList sourcetokenlist
public TStatementList sqlstatements
public TGSqlParser(EDbVendor pdbvendor)
pdbvendor - the database vendor whose grammar rules will be used to validate the syntax of the input SQL
public void setSqlCharset(String sqlCharset)
public String getSqlCharset()
public static EDbVendor getDBVendorByName(String dbVendorName)
dbVendorName - the string name of the database
public Stack<gudusoft.gsqlparser.compiler.TFrame> getFrameStack()
public void setFrameStack(Stack<gudusoft.gsqlparser.compiler.TFrame> frameStack)
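As a hedged illustration of the name-to-vendor mapping documented for getDBVendorByName(String), the sketch below reproduces a few of the listed names with a plain map, including the two documented aliases. Vendor is a local stand-in enum rather than the library's EDbVendor, and the trimming, the case-insensitive matching and the empty Optional for unknown names are assumptions, not documented behavior of the real method.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for illustration only; not the library's implementation.
final class VendorLookup {
    // Local stand-in for a few of the EDbVendor constants named in the method summary.
    enum Vendor { dbvmssql, dbvmysql, dbvoracle, dbvpostgresql }

    private static final Map<String, Vendor> BY_NAME = new HashMap<>();

    static {
        // Aliases listed in the summary map to the same vendor.
        BY_NAME.put("mssql", Vendor.dbvmssql);
        BY_NAME.put("sqlserver", Vendor.dbvmssql);
        BY_NAME.put("mysql", Vendor.dbvmysql);
        BY_NAME.put("oracle", Vendor.dbvoracle);
        BY_NAME.put("postgresql", Vendor.dbvpostgresql);
        BY_NAME.put("postgres", Vendor.dbvpostgresql);
    }

    // Assumption: names are trimmed and matched case-insensitively;
    // unknown names yield an empty Optional.
    static Optional<Vendor> byName(String name) {
        if (name == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(BY_NAME.get(name.trim().toLowerCase(Locale.ROOT)));
    }

    public static void main(String[] args) {
        System.out.println(byName("SQLServer"));
        System.out.println(byName("postgres"));
        System.out.println(byName("nosuchdb"));
    }
}
```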
public TSourceTokenList getSourcetokenlist()
public TStatementList getSqlstatements()
A list of SQL statements created by the parser. If this list is built by the getrawsqlstatements() method, the syntax of each SQL statement
is not checked and the parse tree of each statement is not created. If the parse() method is called to build
this SQL statement list, then everything is ready.
public void setSqltext(String sqltext)
Sets the input SQL text. If sqlfilename is specified before this method, the parser will use
the SQL text in this field instead of reading SQL from sqlfilename.
sqltext - the input SQL text
public String getSqltext()
public void setSqlfilename(String sqlfilename)
sqlfilename - the SQL file name from which the input SQL will be read
public String getSqlfilename()
public void setSqlInputStream(InputStream sqlInputStream)
Sets the InputStream from which SQL will be read. If it is specified, sqlfilename and sqltext will be ignored.
sqlInputStream - the InputStream from which SQL will be read
public InputStream getSqlInputStream()
public ArrayList<TSyntaxError> getSyntaxErrors()
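The list of syntax errors generated by the parser while checking the syntax of the input SQL; each element of this list is of type TSyntaxError.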
public EDbVendor getDbVendor()
public void setEnablePartialParsing(boolean enablePartialParsing)
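Deprecated. As of v1.4.3.4.
Enables GSP to parse the rest of the SQL statements inside a stored procedure when a SQL statement in the stored procedure cannot be parsed.
Currently available only for Sybase stored procedures.
enablePartialParsing - set to true to enable partial parsing; the default is false
public static String getUserName()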
public static String getMachineId()
public static String getLicenseMessage()
public static String getLicenseType()
public TCustomLexer getFlexer()
public void setSqlStatementHandle(ISQLStatementHandle sqlStatementHandle)
public void setTokenHandle(ITokenHandle tokenHandle)
tokenHandle - the event handler that processes each newly created source token
public void setTokenListHandle(ITokenListHandle tokenListHandle)
public void setMetaDatabase(IMetaDatabase metaDatabase)
Deprecated. As of v2.0.3.1, please use getSqlEnv() instead.
Sets an instance of a class which implements the IMetaDatabase interface.
The parser will call the IMetaDatabase.checkColumn(java.lang.String, java.lang.String, java.lang.String, java.lang.String, java.lang.String)
method when it needs to know whether a column belongs to a table.
The class that implements IMetaDatabase usually fetches the metadata from the database
by connecting to a database instance.
If the class is not provided, the parser has to guess the relationship between an unqualified column and a table
in the input SQL, which may lead to an undetermined result between the column and table.
metaDatabase - a new instance of the class which implements the IMetaDatabase interface
See Also: IMetaDatabase
public IMetaDatabase getMetaDatabase()
Deprecated. As of v2.0.3.1, please use getSqlEnv() instead.
Returns: a new instance of the class which implements the IMetaDatabase interface
See Also: IMetaDatabase interface, setMetaDatabase(gudusoft.gsqlparser.IMetaDatabase)
public void freeParseTable()
public TSelectSqlStatement parseSubquery(String subquery)
subquery - a plain text select statement which needs to be converted to a TSelectSqlStatement object
public static TSelectSqlStatement parseSubquery(EDbVendor dbVendor, String subquery)
public TExpression parseExpression(String expr)
expr - a plain text expression which will be converted to a TExpression object
public static TExpression parseExpression(EDbVendor dbVendor, String expr)
public TFunctionCall parseFunctionCall(String newFunction)
newFunction - a plain text function call which will be converted to a TFunctionCall object
public static TFunctionCall parseFunctionCall(EDbVendor dbVendor, String newFunction)
public TObjectName parseObjectName(String newObjectName)
newObjectName - a plain text object name which will be converted to a TObjectName object
public static TObjectName parseObjectName(EDbVendor dbVendor, String newObjectName)
dbVendor -
newObjectName -
public TConstant parseConstant(String newConstant)
newConstant - a plain text constant which will be converted to a TConstant object
public static TConstant parseConstant(EDbVendor dbVendor, String newConstant)
public int getErrorCount()
public String getErrormessage()
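The text of the error message, generated by iterating over all items in getSyntaxErrors().
Users may generate error messages in their own format by iterating over all items in getSyntaxErrors().
public int checkSyntax()
Checks the syntax of the input SQL. See the parse() method.
Use getErrorCount() and getErrormessage() to get detailed error information.
See Also: parse()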
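The note above, that users may build error messages in their own format by iterating over getSyntaxErrors(), can be sketched as follows. SyntaxError here is a hypothetical stand-in with assumed fields (line, column, token text, hint); consult the TSyntaxError class for the fields it really exposes before adapting this.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for illustration only; TSyntaxError's real fields may differ.
final class ErrorReport {
    static final class SyntaxError {
        final long line;
        final long column;
        final String tokenText;
        final String hint;

        SyntaxError(long line, long column, String tokenText, String hint) {
            this.line = line;
            this.column = column;
            this.tokenText = tokenText;
            this.hint = hint;
        }
    }

    // Formats one error per line as "line:column near 'token': hint".
    static String format(List<SyntaxError> errors) {
        return errors.stream()
                .map(e -> e.line + ":" + e.column + " near '" + e.tokenText + "': " + e.hint)
                .collect(Collectors.joining(System.lineSeparator()));
    }

    public static void main(String[] args) {
        List<SyntaxError> errors = List.of(
                new SyntaxError(3, 8, "FORM", "syntax error"),
                new SyntaxError(7, 1, ")", "syntax error"));
        System.out.println(format(errors));
    }
}
```

With the real parser, the list passed to such a formatter would come from getSyntaxErrors() after parse() or checkSyntax() has run.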
public int parse()
Checks the syntax of the input SQL and performs some semantic analysis without connecting to a real database.
If syntax errors are found, the error message can be fetched via the getErrormessage() method.
A syntax error in one SQL statement doesn't prevent the parser from checking the syntax of the next SQL statement.
After checking the syntax of all SQL statements, use the getErrorCount() method to get the total number of errors.
A syntax error in a SQL stored procedure stops the parser from checking the syntax of the remaining SQL statements
in that stored procedure.
Use getErrorCount() and getErrormessage() to get detailed error information.
See Also: getSyntaxErrors()
public int validate()
public void setSinglePLBlock(boolean singlePLBlock)
public int getrawsqlstatements()
Separates the SQL statements in the input SQL script without checking syntax.
Use the getSqlstatements() method to get the list of SQL statements.
Each SQL statement object is an instance of a sub-class of TCustomSqlStatement.
Get the SQL statement type via the TCustomSqlStatement.sqlstatementtype field, and the string representation of
each SQL statement via the TParseTreeNode.toString() method.
All source tokens in a SQL statement are available via the TCustomSqlStatement.sourcetokenlist field.
Since no parse tree is built by calling this method, no further detailed information about the SQL statement is available.
public void tokenizeSqltext()
Turns the input SQL into a sequence of tokens, the basic lexical elements of SQL syntax.
All source tokens can be fetched via the getSourcetokenlist() method.
public TSQLEnv getSqlEnv()
The SQL environment includes database metadata such as procedures, functions, triggers and tables.
The metadata is supplied by providing a TSQLEnv to TGSqlParser.
This class tells TGSqlParser the relationship between columns and tables.
Take this SQL for example:
SELECT Quantity,b.Time,c.Description FROM (SELECT ID2,Time FROM bTab) b INNER JOIN aTab a on a.ID=b.ID INNER JOIN cTab c on a.ID=c.ID
General SQL Parser can build the relationship between column ID2 and table bTab correctly without metadata from the database, because there is only one table in the FROM clause. But it can't tell whether column Quantity belongs to table aTab or cTab, since the column is not prefixed with a table alias. If no metadata is provided, General SQL Parser links column Quantity to the first valid table (here, aTab).
If we create a class that extends TSQLEnv, such as the TSQLServerEnv class below, and pass it to the parser via
setSqlEnv(TSQLEnv), General SQL Parser can take advantage of it to build
the correct relationship between columns and tables.
class TSQLServerEnv extends TSQLEnv {
    public TSQLServerEnv() {
        super(EDbVendor.dbvmssql);
        initSQLEnv();
    }

    @Override
    public void initSQLEnv() {
        // add a new database: master
        TSQLCatalog sqlCatalog = createSQLCatalog("master");
        // add a new schema: dbo
        TSQLSchema sqlSchema = sqlCatalog.createSchema("dbo");
        // add a new table: aTab
        TSQLTable aTab = sqlSchema.createTable("aTab");
        aTab.addColumn("Quantity1");
        // add a new table: bTab
        TSQLTable bTab = sqlSchema.createTable("bTab");
        bTab.addColumn("Quantity2");
        // add a new table: cTab
        TSQLTable cTab = sqlSchema.createTable("cTab");
        cTab.addColumn("Quantity");
    }
}
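The resolution rule described above (use metadata when it is available, otherwise fall back to the first candidate table) can be sketched without the library. ColumnLinker and its plain-map metadata are hypothetical stand-ins for TSQLEnv, and the exact-match lookup is a simplification; it is not a description of how the parser is actually implemented.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for illustration only; not part of gudusoft.gsqlparser.
final class ColumnLinker {
    // tables: candidate tables in FROM-clause order.
    // metadata: table name -> known column names; empty when no metadata is supplied.
    static String resolve(String column, List<String> tables, Map<String, Set<String>> metadata) {
        for (String table : tables) {
            Set<String> columns = metadata.get(table);
            if (columns != null && columns.contains(column)) {
                return table; // metadata confirms this table owns the column
            }
        }
        // No metadata match: guess the first candidate table, as the text describes.
        return tables.isEmpty() ? null : tables.get(0);
    }

    public static void main(String[] args) {
        List<String> candidates = List.of("aTab", "cTab");

        // Without metadata the unqualified column is linked to the first table.
        System.out.println(resolve("Quantity", candidates, Map.of())); // prints aTab

        // With the columns registered in the TSQLServerEnv example, cTab owns Quantity.
        Map<String, Set<String>> metadata = Map.of(
                "aTab", Set.of("Quantity1"),
                "bTab", Set.of("Quantity2"),
                "cTab", Set.of("Quantity"));
        System.out.println(resolve("Quantity", candidates, metadata)); // prints cTab
    }
}
```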
public void setOnlyNeedRawParseTree(boolean onlyNeedRawParseTree)
public void setEnableTimeLogging(boolean enable)
enable - true to enable time logging, false to disable
public boolean isTimeLoggingEnabled()
public void resetTimeCounters()
public long getRawSqlStatementsTime()
public long getParsingTime()
public long getSemanticAnalysisTime()
public long getInterpreterTime()
public long getTotalTime()
public String getTimeStatistics()
public void setTeradataUtilityType(TeradataUtilityType teradataUtilityType)