Apache Hive Keyword Compatibility Reference

Generated for GSP Java version 4.1.0.8 on 2026-03-15

This page was generated using hybrid static extraction from parser source files combined with runtime validation against the actual GSP parser. Re-run the extraction script after parser updates to keep this page current.

Keyword-as-Column-Name Support

As of version 4.1.0.8, the GSP Apache Hive parser includes a lexer lookahead mechanism that allows 17 vendor-unreserved keywords to be used as unquoted column names in SELECT statements.

The lookahead pre-scans the token list before parsing and converts context-specific keywords to identifiers when they appear in column-name position:

  • After: SELECT, a comma (,), DISTINCT, or ALL
  • Before: FROM, AS, WHERE, GROUP, ORDER, HAVING, LIMIT, UNION, INTERSECT, EXCEPT, INTO, a comma (,), a closing parenthesis ()), or a semicolon (;)
-- Works: keyword as column name
SELECT array FROM t;

-- Works: keyword as column name
SELECT bigint FROM t;

-- Works: keyword as column name
SELECT binary FROM t;

-- Original keyword syntax also still works
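
The pre-scan described above can be illustrated with a minimal Python sketch. This is not the GSP lexer: tokens are simplified to plain strings, the function name mark_identifiers is hypothetical, and the keyword set assumes that the 17 keywords covered by the lookahead are the ones listed under Context-Specific Keywords on this page.

```python
# Illustrative sketch only; the real GSP lexer works on token objects.
# Assumption: these are the 17 keywords handled by the lookahead.
CONTEXT_KEYWORDS = {
    "ARRAY", "BIGINT", "BINARY", "BOOLEAN", "CASE", "CAST", "DOUBLE",
    "EXTRACT", "FLOAT", "FLOOR", "GROUPING", "INT", "INTERVAL", "NOT",
    "SMALLINT", "TIMESTAMP", "TIMESTAMPLOCALTZ",
}
# Tokens that may precede / follow a keyword in column-name position.
PRECEDING = {"SELECT", ",", "DISTINCT", "ALL"}
FOLLOWING = {"FROM", "AS", "WHERE", "GROUP", "ORDER", "HAVING", "LIMIT",
             "UNION", "INTERSECT", "EXCEPT", "INTO", ",", ")", ";"}


def mark_identifiers(tokens):
    """Return (text, kind) pairs; kind is 'IDENT' for a keyword in column-name position."""
    result = []
    for i, tok in enumerate(tokens):
        prev_tok = tokens[i - 1].upper() if i > 0 else None
        next_tok = tokens[i + 1].upper() if i + 1 < len(tokens) else None
        in_column_position = prev_tok in PRECEDING and next_tok in FOLLOWING
        if tok.upper() in CONTEXT_KEYWORDS and in_column_position:
            result.append((tok, "IDENT"))
        else:
            result.append((tok, "TOKEN"))
    return result
```

In this sketch, the array token in SELECT array FROM t; is marked as an identifier, while CAST in SELECT CAST ( x AS int ) FROM t; is left alone because the token that follows it is an opening parenthesis, which is not in the "before" set.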

Full Classification Overview

Out of 335 keywords recognized by the GSP Apache Hive parser:

| Classification | Count | Description |
|---|---|---|
| Allowed | 317 | Can be used as an unquoted column name in both tested contexts (bare and table-qualified) |
| Context-specific | 17 | Fails as SELECT keyword FROM t but works as SELECT t.keyword FROM t |
| Blocked | 1 | Cannot be used as an unquoted column name in either context |
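
The three classifications follow from two pass/fail observations per keyword. A minimal Python sketch of that mapping is shown below; the function name classify and the "Unclassified" label are hypothetical, and the latter covers a combination (bare succeeds, qualified fails) that this page does not define.

```python
def classify(bare_ok: bool, qualified_ok: bool) -> str:
    """Map the two probe results to the classification names used on this page.

    bare_ok:      SELECT keyword FROM t parsed successfully
    qualified_ok: SELECT t.keyword FROM t parsed successfully
    """
    if bare_ok and qualified_ok:
        return "Allowed"
    if qualified_ok:
        return "Context-specific"
    if not bare_ok:
        return "Blocked"
    # Bare works but qualified fails: not described on this page (assumption).
    return "Unclassified"


# Counts reported in the table above; they should add up to the keyword total.
COUNTS = {"Allowed": 317, "Context-specific": 17, "Blocked": 1}
assert sum(COUNTS.values()) == 335
```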

Context-Specific Keywords (17)

These keywords fail when used as bare column names (SELECT keyword FROM t) but succeed when table-qualified (SELECT t.keyword FROM t).

| Keyword | Reason |
|---|---|
| ARRAY | Type keyword |
| BIGINT | Type keyword |
| BINARY | Type keyword |
| BOOLEAN | Type keyword |
| CASE | Expression keyword |
| CAST | Expression keyword |
| DOUBLE | Type keyword |
| EXTRACT | Expression keyword |
| FLOAT | Type keyword |
| FLOOR | Grammar keyword |
| GROUPING | Grammar keyword |
| INT | Type keyword |
| INTERVAL | Type keyword |
| NOT | Operator keyword |
| SMALLINT | Type keyword |
| TIMESTAMP | Type keyword |
| TIMESTAMPLOCALTZ | Grammar keyword |

Blocked Keywords (1)

These keywords cannot be used as unquoted column names in either context.

| Keyword | Workaround |
|---|---|
| SET | SELECT "set" FROM t |

Workaround: Double-Quoted Identifiers

For any keyword that fails as an unquoted column name, you can use double-quoted identifiers:

-- Blocked or context-specific keyword as column name
SELECT "set" FROM t;

-- Or use table qualification for context-specific keywords
SELECT t.array FROM t;
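
When SQL is generated programmatically, the quoting workaround can be applied automatically. The Python sketch below is one possible approach, not part of GSP: the helper name quote_if_needed is hypothetical, and the keyword sets are copied from the Context-Specific and Blocked lists on this page. It only covers bare column names.

```python
# Keyword lists copied from this page (GSP Java 4.1.0.8).
CONTEXT_SPECIFIC = frozenset({
    "ARRAY", "BIGINT", "BINARY", "BOOLEAN", "CASE", "CAST", "DOUBLE",
    "EXTRACT", "FLOAT", "FLOOR", "GROUPING", "INT", "INTERVAL", "NOT",
    "SMALLINT", "TIMESTAMP", "TIMESTAMPLOCALTZ",
})
BLOCKED = frozenset({"SET"})


def quote_if_needed(column: str) -> str:
    """Double-quote a bare column name if this page lists it as failing unquoted."""
    # Keywords are case-insensitive, so compare in upper case.
    if column.upper() in CONTEXT_SPECIFIC | BLOCKED:
        return f'"{column}"'
    return column


def select_statement(columns, table: str) -> str:
    """Build a simple SELECT with each column quoted only where required."""
    cols = ", ".join(quote_if_needed(c) for c in columns)
    return f"SELECT {cols} FROM {table};"
```

Quoting only the listed keywords keeps generated SQL readable; quoting every column unconditionally would be a simpler alternative if readability is not a concern.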

Scope and Limitations

  • Tested contexts: SELECT keyword FROM t and SELECT t.keyword FROM t. Other contexts (DDL column definitions, INSERT column lists, aliases) may behave differently.
  • Version-specific: This report reflects GSP Java version 4.1.0.8.
  • Case sensitivity: Keywords are case-insensitive. select, SELECT, and Select are all treated the same.

How to Report Discrepancies

If you encounter a keyword that behaves differently from what this page describes, please report it through your support channel. Include:

  1. The exact SQL statement
  2. The GSP parser version
  3. Whether the same SQL works in Apache Hive

Methodology

  1. Static extraction: A Python script parses the lexer (.cod) and grammar (.y) source files to identify all 335 keywords and their grammar classifications.
  2. Runtime validation: A Java test harness validates every classification against actual TGSqlParser runtime behavior.
  3. JSON dataset: The authoritative data is stored in docs/generated/hive_keyword_compatibility.json.
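
The runtime validation step can be pictured with the following Python sketch. It is an illustration under stated assumptions, not the actual harness (which is written in Java): the function names are hypothetical, the parser is passed in as a callable named parses that returns True when a statement parses, and the record layout is invented for this example and is not the schema of docs/generated/hive_keyword_compatibility.json.

```python
import json


def probe_statements(keyword: str) -> dict:
    """Build the two tested statements for one keyword."""
    kw = keyword.lower()
    return {
        "bare": f"SELECT {kw} FROM t;",
        "qualified": f"SELECT t.{kw} FROM t;",
    }


def run_probes(keywords, parses) -> dict:
    """Run both probes per keyword.

    parses: callable(sql) -> bool, a stand-in for the real parser call.
    """
    return {
        kw.upper(): {ctx: bool(parses(sql)) for ctx, sql in probe_statements(kw).items()}
        for kw in keywords
    }


def to_json(results: dict) -> str:
    """Serialize probe results; this layout is hypothetical."""
    return json.dumps(results, indent=2, sort_keys=True)
```

Passing the parser in as a callable keeps the sketch independent of any particular parser binding, so the same loop can be exercised with a stub during development.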