Databricks Keyword Compatibility Reference

Generated for GSP Java version 4.1.0.8 on 2026-03-15

This page was generated using hybrid static extraction from parser source files combined with runtime validation against the actual GSP parser. Re-run the extraction script after parser updates to keep this page current.

Keyword-as-Column-Name Support

As of version 4.1.0.8, the GSP Databricks parser includes a lexer lookahead mechanism that allows 47 vendor-unreserved keywords to be used as unquoted column names in SELECT statements.

The lookahead scans the token list before parsing and converts context-specific keywords to identifiers when they appear in column-name position:

  • After: SELECT, DISTINCT, ALL, or a comma (,)
  • Before: FROM, AS, WHERE, GROUP, ORDER, HAVING, LIMIT, UNION, INTERSECT, EXCEPT, INTO, a comma (,), a closing parenthesis, or a semicolon (;)
-- Works: keyword as column name
SELECT array FROM t;

-- Works: keyword as column name
SELECT bigint FROM t;

-- Works: keyword as column name
SELECT binary FROM t;

-- Original keyword syntax also still works
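
The position rule can be sketched in a few lines of Python. This only illustrates the rule as described above and is not the GSP lexer's actual code: the function name `mark_column_keywords`, the `IDENT`/`TOKEN` labels, and the three-keyword `CONVERTIBLE` subset (taken from the examples above) are assumptions, and the sketch assumes a keyword must satisfy both the After and the Before condition.

```python
# Hypothetical sketch of the lookahead rule; not the GSP lexer's real code.
AFTER = {"SELECT", ",", "DISTINCT", "ALL"}
BEFORE = {"FROM", "AS", "WHERE", "GROUP", "ORDER", "HAVING", "LIMIT",
          "UNION", "INTERSECT", "EXCEPT", "INTO", ",", ")", ";"}
# Illustrative subset only; the page states that 47 keywords are covered.
CONVERTIBLE = {"ARRAY", "BIGINT", "BINARY"}


def mark_column_keywords(tokens):
    """Return (text, kind) pairs; kind is 'IDENT' for converted keywords."""
    result = []
    for i, tok in enumerate(tokens):
        # Neighbours are compared case-insensitively; None at either end.
        prev_tok = tokens[i - 1].upper() if i > 0 else None
        next_tok = tokens[i + 1].upper() if i + 1 < len(tokens) else None
        in_column_position = prev_tok in AFTER and next_tok in BEFORE
        if tok.upper() in CONVERTIBLE and in_column_position:
            result.append((tok, "IDENT"))
        else:
            result.append((tok, "TOKEN"))
    return result
```

With this sketch, `array` in `SELECT array FROM t;` is marked as an identifier, while `bigint` in `CAST(x AS bigint)` stays a keyword because it does not follow SELECT, DISTINCT, ALL, or a comma.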

Full Classification Overview

Out of 585 keywords recognized by the GSP Databricks parser:

Classification | Count | Description
Allowed | 535 | Can be used as an unquoted column name in both canonical contexts
Context-specific | 49 | Fails as SELECT keyword FROM t but works as SELECT t.keyword FROM t
Blocked | 1 | Cannot be used as an unquoted column name in either context
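
The three classes follow from the outcome of the two canonical tests. A minimal Python sketch of that mapping is shown here; the function name `classify` and the `Unclassified` fallback are assumptions, since the page does not describe a keyword that parses bare but fails when table-qualified.

```python
def classify(bare_ok: bool, qualified_ok: bool) -> str:
    """Map the two canonical test outcomes to a classification label.

    bare_ok:      SELECT keyword FROM t parsed successfully
    qualified_ok: SELECT t.keyword FROM t parsed successfully
    """
    if bare_ok and qualified_ok:
        return "Allowed"
    if not bare_ok and qualified_ok:
        return "Context-specific"
    if not bare_ok and not qualified_ok:
        return "Blocked"
    # Assumption: this combination is not described on the page.
    return "Unclassified"


# The counts from the table add up to the total number of keywords.
COUNTS = {"Allowed": 535, "Context-specific": 49, "Blocked": 1}
TOTAL = sum(COUNTS.values())
```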

Context-Specific Keywords (49)

These keywords fail when used as bare column names (SELECT keyword FROM t) but succeed when table-qualified (SELECT t.keyword FROM t).

Keyword | Reason
ALL | SELECT qualifier
ARRAY | Type keyword
BIGINT | Type keyword
BINARY | Type keyword
BOOLEAN | Type keyword
CASE | Expression keyword
CAST | Expression keyword
CHAR | Type keyword
COALESCE | Expression keyword
DEC | Type keyword
DECIMAL | Type keyword
DISTINCT | SELECT qualifier
DOUBLE | Type keyword
EXISTS | Operator keyword
FLOAT | Type keyword
GREATEST | Expression keyword
INT | Type keyword
INTEGER | Type keyword
INTERVAL | Type keyword
INTO | Clause keyword
LEAST | Expression keyword
MAP | Grammar keyword
NOT | Operator keyword
NTH_VALUE | Grammar keyword
NULLIF | Expression keyword
NUMERIC | Type keyword
OVERLAY | Expression keyword
PERCENTILE_CONT | Grammar keyword
PERCENTILE_DISC | Grammar keyword
REAL | Type keyword
ROW | Grammar keyword
SMALLINT | Type keyword
STRING | Type keyword
STRUCT | Grammar keyword
SUBSTR | Grammar keyword
SUBSTRING | Expression keyword
TINYINT | Type keyword
TREAT | Expression keyword
TRY_CAST | Grammar keyword
VARBINARY | Type keyword
VARCHAR | Type keyword
XMLCONCAT | Grammar keyword
XMLELEMENT | Grammar keyword
XMLEXISTS | Grammar keyword
XMLFOREST | Grammar keyword
XMLPARSE | Grammar keyword
XMLPI | Grammar keyword
XMLROOT | Grammar keyword
XMLSERIALIZE | Grammar keyword

Blocked Keywords (1)

This keyword cannot be used as an unquoted column name in either context.

Keyword | Workaround
FROM | SELECT "from" FROM t

Workaround: Double-Quoted Identifiers

For any keyword that fails as an unquoted column name, you can use double-quoted identifiers:

-- Blocked or context-specific keyword as column name
SELECT "from" FROM t;

-- Or use table qualification for context-specific keywords
SELECT t.all FROM t;
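
When SQL is generated programmatically, the two workarounds can be applied automatically. The following Python sketch is one way to do that; the helper name `safe_column` is hypothetical, and `CONTEXT_SPECIFIC` holds only a small subset of the 49 keywords listed on this page.

```python
# Keyword sets taken from this page; CONTEXT_SPECIFIC is a partial subset.
BLOCKED = {"FROM"}
CONTEXT_SPECIFIC = {"ALL", "ARRAY", "CASE", "CAST", "DISTINCT", "INTO", "NOT"}


def safe_column(name: str, table: str) -> str:
    """Return a column reference that avoids the documented parse failures."""
    upper = name.upper()  # keywords are matched case-insensitively
    if upper in BLOCKED:
        # Blocked keywords need a double-quoted identifier.
        return f'"{name}"'
    if upper in CONTEXT_SPECIFIC:
        # Context-specific keywords work when table-qualified.
        return f"{table}.{name}"
    # Everything else is used as written.
    return name


def select_statement(name: str, table: str) -> str:
    """Build a single-column SELECT using the safe column reference."""
    return f"SELECT {safe_column(name, table)} FROM {table};"
```

For example, `select_statement("from", "t")` produces `SELECT "from" FROM t;` and `select_statement("all", "t")` produces `SELECT t.all FROM t;`, matching the two statements above.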

Scope and Limitations

  • Tested contexts: SELECT keyword FROM t and SELECT t.keyword FROM t. Other contexts (DDL column definitions, INSERT column lists, aliases) may behave differently.
  • Version-specific: This report reflects GSP Java version 4.1.0.8.
  • Case sensitivity: Keywords are case-insensitive. select, SELECT, and Select are all treated the same.

How to Report Discrepancies

If you encounter a keyword that behaves differently from what this page describes, please report it through your support channel. Include:

  1. The exact SQL statement
  2. The GSP parser version
  3. Whether the same SQL works in Databricks

Methodology

  1. Static extraction: A Python script parses the lexer (.cod) and grammar (.y) source files to identify all 585 keywords and their grammar classifications.
  2. Runtime validation: A Java test harness validates every classification against actual TGSqlParser runtime behavior.
  3. JSON dataset: The authoritative data is stored in docs/generated/databricks_keyword_compatibility.json.
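
The runtime validation step can be pictured with a small Python sketch. It is not the Java test harness itself: the names `probe_statements` and `run_probes` are hypothetical, and `parses` is a caller-supplied stand-in for whatever invokes the real parser and reports success or failure. Only the two probe statements come from this page.

```python
from typing import Callable, Dict, Iterable, Tuple


def probe_statements(keyword: str, table: str = "t") -> Tuple[str, str]:
    """Build the two canonical test statements for one keyword."""
    bare = f"SELECT {keyword} FROM {table}"
    qualified = f"SELECT {table}.{keyword} FROM {table}"
    return bare, qualified


def run_probes(
    keywords: Iterable[str],
    parses: Callable[[str], bool],
) -> Dict[str, Tuple[bool, bool]]:
    """Run both probes per keyword; values are (bare_ok, qualified_ok)."""
    results = {}
    for keyword in keywords:
        bare, qualified = probe_statements(keyword)
        results[keyword] = (parses(bare), parses(qualified))
    return results
```

Passing the parser in as a callable keeps the sketch independent of any particular parser API, so the same loop can be exercised with a fake parser in tests.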