Skip to content

Apache Spark SQL Keyword Compatibility Reference

Generated for GSP Java version 4.1.0.8 on 2026-03-15

This page was generated using hybrid static extraction from parser source files combined with runtime validation against the actual GSP parser. Re-run the extraction script after parser updates to keep this page current.

Keyword-as-Column-Name Support

As of version 4.1.0.8, the GSP Apache Spark SQL parser includes a lexer lookahead mechanism that allows 14 vendor-unreserved keywords to be used as unquoted column names in SELECT statements.

The lookahead pre-scans the token list before parsing and converts context-specific keywords to identifiers when they appear in column-name position:

  • After: SELECT, ,, DISTINCT, or ALL
  • Before: FROM, AS, WHERE, GROUP, ORDER, HAVING, LIMIT, UNION, INTERSECT, EXCEPT, INTO, ,, ), or ;
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
-- Works: keyword as column name
SELECT between FROM t;

-- Works: keyword as column name
SELECT case FROM t;

-- Works: keyword as column name
SELECT exists FROM t;

-- Original keyword syntax also still works

Full Classification Overview

Out of 330 keywords recognized by the GSP Apache Spark SQL parser:

Classification Count Description
Allowed 313 Can be used as an unquoted column name in both canonical contexts
Context-specific 16 Fails as SELECT keyword FROM t but works as SELECT t.keyword FROM t
Blocked 1 Cannot be used as an unquoted column name in either context

Context-Specific Keywords (16)

These keywords fail when used as bare column names (SELECT keyword FROM t) but succeed when table-qualified (SELECT t.keyword FROM t).

Keyword Reason
ALL SELECT qualifier
BETWEEN Operator keyword
CASE Expression keyword
DISTINCT SELECT qualifier
EXISTS Operator keyword
EXTEND Grammar keyword
IF Expression keyword
INSERT Grammar keyword
INTERVAL Type keyword
IS Operator keyword
JOIN JOIN keyword
LIKE Operator keyword
NOT Operator keyword
RIGHT JOIN keyword
ROW Grammar keyword
TRANSFORM Grammar keyword

Blocked Keywords (1)

These keywords cannot be used as unquoted column names in either context.

Keyword Workaround
FROM SELECT "from" FROM t

Workaround: Double-Quoted Identifiers

For any keyword that fails as an unquoted column name, you can use double-quoted identifiers:

1
2
3
4
5
-- Blocked or context-specific keyword as column name
SELECT "from" FROM t;

-- Or use table qualification for context-specific keywords
SELECT t.all FROM t;

Scope and Limitations

  • Tested contexts: SELECT keyword FROM t and SELECT t.keyword FROM t. Other contexts (DDL column definitions, INSERT column lists, aliases) may behave differently.
  • Version-specific: This report reflects GSP Java version 4.1.0.8.
  • Case sensitivity: Keywords are case-insensitive. select, SELECT, and Select are all treated the same.

How to Report Discrepancies

If you encounter a keyword that behaves differently from what this page describes, please report it through your support channel. Include:

  1. The exact SQL statement
  2. The GSP parser version
  3. Whether the same SQL works in Apache Spark SQL

Methodology

  1. Static extraction: A Python script parses the lexer (.cod) and grammar (.y) source files to identify all 330 keywords and their grammar classifications.
  2. Runtime validation: A Java test harness validates every classification against actual TGSqlParser runtime behavior.
  3. JSON dataset: The authoritative data is stored in docs/generated/sparksql_keyword_compatibility.json.