Class IdentifierRules
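Defines the identifier case rules of a database vendor, distinguishing how quoted and unquoted identifiers are handled.
Design source: dbobject_search.md (senior designer's proposal)
Usage example:
    // Oracle: unquoted identifiers fold to upper case and compare case-insensitively;
    // quoted identifiers are preserved as written and compare case-sensitively
    IdentifierRules oracleRules = IdentifierRules.forOracle();
    // ClickHouse: everything is case-sensitive
    IdentifierRules clickhouseRules = IdentifierRules.forClickHouse();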
- Since: 3.1.0.9
Nested Class Summary
Nested Classes (Modifier and Type, Class, Description):
- static enum IdentifierRules.CaseCompare: case comparison rules
- static enum IdentifierRules.CaseFold: case folding rules
Field Summary
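Fields (Modifier and Type, Field, Description):
- final IdentifierRules.CaseCompare quotedCompare: case comparison rule for quoted identifiers
- final IdentifierRules.CaseFold quotedFold: case folding rule for quoted identifiers (usually NONE, preserving the text as written)
- final IdentifierRules.CaseCompare unquotedCompare: case comparison rule for unquoted identifiers
- final IdentifierRules.CaseFold unquotedFold: case folding rule for unquoted identifiers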
Constructor Summary
Constructors (Constructor, Description):
- IdentifierRules(IdentifierRules.CaseFold unquotedFold, IdentifierRules.CaseCompare unquotedCompare, IdentifierRules.CaseFold quotedFold, IdentifierRules.CaseCompare quotedCompare): constructs identifier rules
Method Summary
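Methods (Modifier and Type, Method, Description):
- static IdentifierRules forAthena(): Athena identifier rules (same as Presto)
- static IdentifierRules forBigQueryColumn(): BigQuery column name rules (case-insensitive)
- static IdentifierRules forBigQueryTable(): BigQuery table name rules (case-sensitive)
- static IdentifierRules forClickHouse(): ClickHouse identifier rules
- static IdentifierRules forCouchbase(): Couchbase N1QL identifier rules
- static IdentifierRules forDatabricks(): Databricks identifier rules (same as Hive)
- static IdentifierRules forDB2(): DB2 / Netezza / Exasol identifier rules (same as Oracle)
- static IdentifierRules forGaussDB(): GaussDB identifier rules (same as PostgreSQL)
- static IdentifierRules forGeneric(): generic rules (default: same as PostgreSQL)
- static IdentifierRules forHANA(): SAP HANA identifier rules (same as Oracle)
- static IdentifierRules forHive(): Hive / SparkSQL / Impala identifier rules (same as PostgreSQL)
- static IdentifierRules forMySQL(int lowerCaseTableNames): MySQL identifier rules (table/schema names)
- static IdentifierRules forMySQLColumn(): MySQL column name rules (always case-insensitive)
- static IdentifierRules forMySQLRoutine(): MySQL function name rules (always case-insensitive)
- static IdentifierRules forOracle(): Oracle identifier rules
- static IdentifierRules forPostgreSQL(): PostgreSQL / Redshift / Greenplum identifier rules
- static IdentifierRules forPresto(): Presto / Trino identifier rules
- static IdentifierRules forSnowflake(): Snowflake identifier rules (same as Oracle)
- static IdentifierRules forSQLServer(): SQL Server / Azure SQL identifier rules
- static IdentifierRules forTeradata(): Teradata identifier rules (same as PostgreSQL)
- static IdentifierRules forVertica(): Vertica identifier rules (same as Presto)
- String toString()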
-
Field Details
-
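unquotedFold
Case folding rule for unquoted identifiers
-
unquotedCompare
Case comparison rule for unquoted identifiers
-
quotedFold
Case folding rule for quoted identifiers (usually NONE, preserving the text as written)
-
quotedCompare
Case comparison rule for quoted identifiers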
-
-
Constructor Details
-
IdentifierRules
public IdentifierRules(IdentifierRules.CaseFold unquotedFold, IdentifierRules.CaseCompare unquotedCompare, IdentifierRules.CaseFold quotedFold, IdentifierRules.CaseCompare quotedCompare)
Constructs identifier rules.
- Parameters:
unquotedFold - folding rule for unquoted identifiers
unquotedCompare - comparison rule for unquoted identifiers
quotedFold - folding rule for quoted identifiers
quotedCompare - comparison rule for quoted identifiers
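To make the four constructor arguments concrete, here is a minimal sketch built around a hypothetical stand-in class, `RulesSketch`, that mirrors the documented constructor shape. It is not the library class: the enum constants are the ones named elsewhere on this page, and the real enums may define more. The two factory helpers are illustrative names only.

```java
// Hypothetical stand-in mirroring the documented four-argument constructor.
// It is NOT the library's IdentifierRules class.
final class RulesSketch {
    enum CaseFold { NONE, UPPER, LOWER }
    enum CaseCompare { SENSITIVE, INSENSITIVE, COLLATION_BASED, SAME_AS_UNQUOTED }

    final CaseFold unquotedFold;
    final CaseCompare unquotedCompare;
    final CaseFold quotedFold;
    final CaseCompare quotedCompare;

    RulesSketch(CaseFold unquotedFold, CaseCompare unquotedCompare,
                CaseFold quotedFold, CaseCompare quotedCompare) {
        this.unquotedFold = unquotedFold;
        this.unquotedCompare = unquotedCompare;
        this.quotedFold = quotedFold;
        this.quotedCompare = quotedCompare;
    }

    // Configuration documented for Oracle: fold unquoted to upper case,
    // keep quoted names as written and compare them case-sensitively.
    static RulesSketch oracleLike() {
        return new RulesSketch(CaseFold.UPPER, CaseCompare.INSENSITIVE,
                               CaseFold.NONE, CaseCompare.SENSITIVE);
    }

    // Configuration documented for ClickHouse: nothing folds, everything is case-sensitive.
    static RulesSketch clickHouseLike() {
        return new RulesSketch(CaseFold.NONE, CaseCompare.SENSITIVE,
                               CaseFold.NONE, CaseCompare.SENSITIVE);
    }

    @Override
    public String toString() {
        return "unquoted=" + unquotedFold + "/" + unquotedCompare
             + ", quoted=" + quotedFold + "/" + quotedCompare;
    }

    public static void main(String[] args) {
        System.out.println(oracleLike());
        System.out.println(clickHouseLike());
    }
}
```

Keeping folding and comparison as separate settings lets a vendor preserve the stored spelling while still matching names case-insensitively.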
-
-
Method Details
-
forOracle
Oracle identifier rules.
Actual database behavior (Oracle 12c+):
- Unquoted: folded to upper case, compared case-insensitively (CREATE TABLE foo → stored as FOO, foo=FOO=Foo)
- Quoted: preserved as written, compared case-sensitively (CREATE TABLE "foo" → stored as foo, "foo"!="FOO")
Compatibility with legacy TSQLEnv:
- columnCollationCaseSensitive = true → ❌ INCOMPATIBLE (legacy preserved case, new folds to UPPER)
- functionCollationCaseSensitive = true → ❌ INCOMPATIBLE (legacy preserved case, new folds to UPPER)
- tableCollationCaseSensitive = true → ❌ INCOMPATIBLE (legacy preserved case, new folds to UPPER)
- catalogCollationCaseSensitive = false → ✅ COMPATIBLE (both fold to UPPER)
Impact on test cases:
- ✅ The new rule is correct: Oracle does fold unquoted identifiers to upper case
- ⚠️ Legacy tests that expect the original case to be preserved (e.g. "myTable" stays "myTable") will fail
- ⚠️ Update those expectations to upper case (e.g. "myTable" → "MYTABLE")
IdentifierRules configuration:
- Unquoted: folded to upper case (IdentifierRules.CaseFold.UPPER), case-insensitive comparison (IdentifierRules.CaseCompare.INSENSITIVE)
- Quoted: preserved as written (IdentifierRules.CaseFold.NONE), case-sensitive comparison (IdentifierRules.CaseCompare.SENSITIVE)
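The Oracle behavior described here can be sketched as a fold-then-compare step. The class and method names below (`OracleFoldSketch`, `normalize`, `sameObject`) are hypothetical and only illustrate the rule; they are not part of IdentifierRules.

```java
import java.util.Locale;

// Hypothetical illustration of Oracle-style folding; not the library API.
final class OracleFoldSketch {
    // Unquoted names fold to upper case; quoted names are kept exactly as written.
    static String normalize(String name, boolean quoted) {
        return quoted ? name : name.toUpperCase(Locale.ROOT);
    }

    // Two references denote the same object when their normalized forms are equal.
    static boolean sameObject(String a, boolean aQuoted, String b, boolean bQuoted) {
        return normalize(a, aQuoted).equals(normalize(b, bQuoted));
    }

    public static void main(String[] args) {
        System.out.println(normalize("myTable", false));
        System.out.println(sameObject("foo", false, "Foo", false));
        System.out.println(sameObject("foo", true, "FOO", true));
    }
}
```

Under this sketch an unquoted `foo` also matches a quoted `"FOO"`, because both normalize to the same stored form.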
-
forPostgreSQL
PostgreSQL / Redshift / Greenplum identifier rules.
Actual database behavior (PostgreSQL 12+):
- Unquoted: folded to lower case, compared case-insensitively (CREATE TABLE MyTable → stored as mytable, MyTable=mytable=MYTABLE)
- Quoted: preserved as written, compared case-sensitively (CREATE TABLE "MyTable" → stored as MyTable, "MyTable"!="mytable")
Compatibility with legacy TSQLEnv:
- columnCollationCaseSensitive = false → ⚠️ PARTIAL (legacy folded to UPPER, new folds to LOWER)
- functionCollationCaseSensitive = true → ❌ INCOMPATIBLE (legacy preserved case, new folds to LOWER)
- tableCollationCaseSensitive = true → ❌ INCOMPATIBLE (legacy preserved case, new folds to LOWER)
- catalogCollationCaseSensitive = true → ❌ INCOMPATIBLE (legacy preserved case, new folds to LOWER)
Impact on test cases:
- ✅ The new rule is correct: PostgreSQL does fold unquoted identifiers to lower case
- ⚠️ Legacy tests that expect upper case (e.g. "MyTable" → "MYTABLE") will fail
- ⚠️ Update those expectations to lower case (e.g. "MyTable" → "mytable")
- ⚠️ Legacy tests that expect the original case to be preserved will also fail
IdentifierRules configuration:
- Unquoted: folded to lower case (IdentifierRules.CaseFold.LOWER), case-insensitive comparison (IdentifierRules.CaseCompare.INSENSITIVE)
- Quoted: preserved as written (IdentifierRules.CaseFold.NONE), case-sensitive comparison (IdentifierRules.CaseCompare.SENSITIVE)
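When updating test expectations, it can help to compute the stored name directly from a raw SQL token. The helper below is a hypothetical sketch, assuming the token is either bare or wrapped in double quotes; it does not handle doubled quotes inside a quoted name and is not how the library itself parses identifiers.

```java
import java.util.Locale;

// Hypothetical helper for deriving PostgreSQL-style stored names from raw tokens.
final class PostgresTokenSketch {
    // A double-quoted token keeps its inner text; any other token folds to lower case.
    static String storedName(String token) {
        boolean quoted = token.length() >= 2
                && token.startsWith("\"")
                && token.endsWith("\"");
        if (quoted) {
            return token.substring(1, token.length() - 1);
        }
        return token.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(storedName("MyTable"));
        System.out.println(storedName("\"MyTable\""));
    }
}
```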
-
forClickHouse
ClickHouse identifier rules.
Actual database behavior (ClickHouse 20+):
- Unquoted: preserved as written, compared case-sensitively (CREATE TABLE MyTable → stored as MyTable, MyTable!=mytable)
- Quoted: preserved as written, compared case-sensitively (CREATE TABLE `MyTable` → stored as MyTable, MyTable!=mytable)
Compatibility with legacy TSQLEnv:
- columnCollationCaseSensitive = false → ❌ INCOMPATIBLE (legacy folded to UPPER, new preserves case)
- functionCollationCaseSensitive = false → ❌ INCOMPATIBLE (legacy folded to UPPER, new preserves case)
- tableCollationCaseSensitive = false → ❌ INCOMPATIBLE (legacy folded to UPPER, new preserves case)
- catalogCollationCaseSensitive = false → ❌ INCOMPATIBLE (legacy folded to UPPER, new preserves case)
Impact on test cases:
- ✅ The new rule is correct: ClickHouse is a fully case-sensitive database
- ⚠️ Legacy tests that expect "MyTable" → "MYTABLE" will fail
- ⚠️ Update those expectations to the preserved form (e.g. "MyTable" stays "MyTable")
- ⚠️ Legacy code may have wrongly matched identifiers that differ only in case (foo matching FOO); the new code correctly rejects them
IdentifierRules configuration:
- Unquoted: preserved as written (IdentifierRules.CaseFold.NONE), case-sensitive comparison (IdentifierRules.CaseCompare.SENSITIVE)
- Quoted: preserved as written (IdentifierRules.CaseFold.NONE), case-sensitive comparison (IdentifierRules.CaseCompare.SENSITIVE)
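The difference between the legacy upper-folding lookup and the case-sensitive lookup can be sketched with two tiny catalogs. Both methods are hypothetical illustrations; the legacy TSQLEnv and the new implementation are not shown on this page, so the map-based lookup is an assumption.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical comparison of legacy (upper-folded) and new (verbatim) lookups.
final class ClickHouseLookupSketch {
    // Legacy style: keys are folded to upper case on both store and lookup.
    static boolean legacyFinds(String defined, String referenced) {
        Map<String, String> catalog = new HashMap<>();
        catalog.put(defined.toUpperCase(Locale.ROOT), defined);
        return catalog.containsKey(referenced.toUpperCase(Locale.ROOT));
    }

    // New style: keys are stored and looked up exactly as written.
    static boolean newFinds(String defined, String referenced) {
        Map<String, String> catalog = new HashMap<>();
        catalog.put(defined, defined);
        return catalog.containsKey(referenced);
    }

    public static void main(String[] args) {
        System.out.println(legacyFinds("foo", "FOO"));
        System.out.println(newFinds("foo", "FOO"));
    }
}
```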
-
forCouchbase
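Couchbase N1QL identifier rules.
Actual database behavior (Couchbase N1QL):
- Unquoted: preserved as written, compared case-sensitively (same as ClickHouse)
- Quoted: preserved as written, compared case-sensitively
Compatibility with legacy TSQLEnv:
- tableCollationCaseSensitive = false → ❌ INCOMPATIBLE (legacy folded to UPPER, new preserves case)
- See forClickHouse() for details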
-
forSQLServer
SQL Server / Azure SQL identifier rules.
Actual database behavior (SQL Server 2019+):
- Unquoted: preserved as written, comparison determined by collation (CREATE TABLE MyTable → stored as MyTable)
- Quoted: preserved as written, comparison determined by collation (CREATE TABLE [MyTable] → stored as MyTable)
- Default collation (SQL_Latin1_General_CP1_CI_AS): case-insensitive (MyTable=mytable=MYTABLE)
- CS collation (SQL_Latin1_General_CP1_CS_AS): case-sensitive (MyTable!=mytable)
Compatibility with legacy TSQLEnv:
- columnCollationCaseSensitive = false → ❌ INCOMPATIBLE (legacy folded to UPPER, new preserves case)
- functionCollationCaseSensitive = false → ❌ INCOMPATIBLE (legacy folded to UPPER, new preserves case)
- tableCollationCaseSensitive = false → ❌ INCOMPATIBLE (legacy folded to UPPER, new preserves case)
- catalogCollationCaseSensitive = false → ❌ INCOMPATIBLE (legacy folded to UPPER, new preserves case)
Impact on test cases:
- ✅ The new rule is correct: SQL Server preserves the original case of identifiers and compares them by collation
- ⚠️ Legacy tests that expect "MyTable" → "MYTABLE" will fail
- ⚠️ Update those expectations to the preserved form (e.g. "MyTable" stays "MyTable")
- ⚠️ This is the root cause of the processId changes in the dataflow tests
- 📝 Reference: investigation_findings_2025_10_20.md
IdentifierRules configuration:
- Unquoted: no folding (IdentifierRules.CaseFold.NONE), collation-based comparison (IdentifierRules.CaseCompare.COLLATION_BASED)
- Quoted: no folding (IdentifierRules.CaseFold.NONE), collation-based comparison (IdentifierRules.CaseCompare.COLLATION_BASED)
- Must be used together with a CollatorProvider; the default is SQL_Latin1_General_CP1_CI_AS (case-insensitive)
Note: SQL Server's case behavior is determined entirely by the collation and cannot be reduced to simple folding.
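Collation-based comparison can be approximated with the JDK's `java.text.Collator`. This is only a rough sketch: the CollatorProvider API is not shown on this page, so the helper names below are hypothetical, and mapping SECONDARY strength to a `*_CI_AS` collation and TERTIARY strength to `*_CS_AS` is an assumption rather than an exact equivalence.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical approximation of collation-based identifier comparison.
final class CollationCompareSketch {
    // SECONDARY strength ignores case differences but keeps accent differences;
    // TERTIARY strength also distinguishes upper and lower case.
    static Collator collator(boolean caseSensitive) {
        Collator c = Collator.getInstance(Locale.ROOT);
        c.setStrength(caseSensitive ? Collator.TERTIARY : Collator.SECONDARY);
        return c;
    }

    // No folding is applied; the names are compared as written under the collator.
    static boolean sameIdentifier(String a, String b, boolean caseSensitive) {
        return collator(caseSensitive).compare(a, b) == 0;
    }

    public static void main(String[] args) {
        System.out.println(sameIdentifier("MyTable", "mytable", false));
        System.out.println(sameIdentifier("MyTable", "mytable", true));
    }
}
```

Because nothing is folded, the stored spelling stays `MyTable` in both modes; only the equality test changes.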
-
forMySQL
MySQL identifier rules (table/schema names).
Actual database behavior (MySQL 8.0+):
Determined by the lower_case_table_names system variable:
- 0 (Unix/Linux): case-sensitive, preserved as written (CREATE TABLE MyTable → stored as MyTable, MyTable!=mytable)
- 1 (Windows): stored in lower case, compared case-insensitively (CREATE TABLE MyTable → stored as mytable, MyTable=mytable=MYTABLE)
- 2 (macOS): stored as written, compared case-insensitively (CREATE TABLE MyTable → stored as MyTable, MyTable=mytable=MYTABLE)
Compatibility with legacy TSQLEnv:
- columnCollationCaseSensitive = false → ⚠️ PARTIAL (legacy folded to UPPER, new doesn't fold or folds to LOWER)
- functionCollationCaseSensitive = false → ⚠️ PARTIAL (legacy folded to UPPER, new doesn't fold)
- tableCollationCaseSensitive = false → ⚠️ PARTIAL (legacy folded to UPPER, new behavior depends on lower_case_table_names)
- catalogCollationCaseSensitive = true → ⚠️ PARTIAL (legacy preserved case, new is insensitive)
Impact on test cases:
- ✅ The new rule is correct: MySQL's behavior does depend on the lower_case_table_names setting
- ⚠️ Legacy tests that expect "MyTable" → "MYTABLE" will fail in modes 1 and 2
- ⚠️ Mode 0: expect the preserved form (e.g. "MyTable" stays "MyTable"), case-sensitive
- ⚠️ Mode 1: expect lower case (e.g. "MyTable" → "mytable"), case-insensitive
- ⚠️ Mode 2: expect the preserved form (e.g. "MyTable" stays "MyTable"), case-insensitive
IdentifierRules configuration:
- Mode 0: no folding (IdentifierRules.CaseFold.NONE), case-sensitive (IdentifierRules.CaseCompare.SENSITIVE)
- Mode 1: folded to lower case (IdentifierRules.CaseFold.LOWER), case-insensitive (IdentifierRules.CaseCompare.INSENSITIVE)
- Mode 2: no folding (IdentifierRules.CaseFold.NONE), case-insensitive (IdentifierRules.CaseCompare.INSENSITIVE)
- Quoted: preserved as written, but also case-insensitive (MySQL-specific behavior)
- Parameters:
lowerCaseTableNames - the lower_case_table_names value (0, 1, 2)
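The three modes map naturally onto a small switch. The following is a sketch under stated assumptions: `MySqlModeSketch` and its methods are hypothetical names, and rejecting values outside 0 to 2 with an exception is an assumption, since this page does not say how forMySQL treats other values.

```java
import java.util.Locale;

// Hypothetical stand-in for the documented mode table; not the library class.
final class MySqlModeSketch {
    enum CaseFold { NONE, LOWER }
    enum CaseCompare { SENSITIVE, INSENSITIVE }

    // Folding rule per lower_case_table_names value.
    static CaseFold foldFor(int lowerCaseTableNames) {
        switch (lowerCaseTableNames) {
            case 0: return CaseFold.NONE;
            case 1: return CaseFold.LOWER;
            case 2: return CaseFold.NONE;
            default:
                // Assumption: reject unknown values; the real method may behave differently.
                throw new IllegalArgumentException("lower_case_table_names must be 0, 1 or 2");
        }
    }

    // Only mode 0 compares case-sensitively.
    static CaseCompare compareFor(int lowerCaseTableNames) {
        foldFor(lowerCaseTableNames); // reuse the range check
        return lowerCaseTableNames == 0 ? CaseCompare.SENSITIVE : CaseCompare.INSENSITIVE;
    }

    static String storedName(String name, int mode) {
        return foldFor(mode) == CaseFold.LOWER ? name.toLowerCase(Locale.ROOT) : name;
    }

    static boolean sameTable(String a, String b, int mode) {
        return compareFor(mode) == CaseCompare.SENSITIVE ? a.equals(b) : a.equalsIgnoreCase(b);
    }

    public static void main(String[] args) {
        for (int mode = 0; mode <= 2; mode++) {
            System.out.println(mode + ": " + storedName("MyTable", mode)
                    + " " + sameTable("MyTable", "mytable", mode));
        }
    }
}
```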
-
forMySQLColumn
MySQL column name rules (always case-insensitive).
Actual database behavior (MySQL 8.0+):
- Column names are always case-insensitive and are not affected by lower_case_table_names
- Stored as written, compared case-insensitively (SELECT MyColumn → stored as MyColumn, MyColumn=mycolumn)
Compatibility with legacy TSQLEnv:
- columnCollationCaseSensitive = false → ⚠️ PARTIAL (legacy folded to UPPER, new preserves case but is INSENSITIVE)
-
forMySQLRoutine
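MySQL function name rules (always case-insensitive).
Actual database behavior (MySQL 8.0+):
- Function and stored procedure names are always case-insensitive
- Stored as written, compared case-insensitively (same as column names)
Compatibility with legacy TSQLEnv:
- functionCollationCaseSensitive = false → ⚠️ PARTIAL (legacy folded to UPPER, new preserves case but is INSENSITIVE)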
-
forBigQueryTable
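BigQuery table name rules (case-sensitive).
Actual database behavior (BigQuery Standard SQL):
- Unquoted: preserved as written, compared case-sensitively (CREATE TABLE MyTable → stored as MyTable, MyTable!=mytable)
- Quoted: preserved as written, compared case-sensitively (CREATE TABLE `MyTable` → stored as MyTable, MyTable!=mytable)
- Table, dataset and project names are all case-sensitive
Compatibility with legacy TSQLEnv:
- tableCollationCaseSensitive = false → ❌ INCOMPATIBLE (legacy folded to UPPER, new preserves case and is SENSITIVE)
- catalogCollationCaseSensitive = false → ❌ INCOMPATIBLE (legacy folded to UPPER, new preserves case and is SENSITIVE)
Impact on test cases:
- ✅ The new rule is correct: BigQuery table names are case-sensitive
- ⚠️ Legacy tests that expect "MyTable" → "MYTABLE" will fail
- ⚠️ Update those expectations to the preserved form (e.g. "MyTable" stays "MyTable")
- ⚠️ Legacy code may have wrongly matched table names that differ only in case; the new code correctly rejects them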
-
forBigQueryColumn
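BigQuery column name rules (case-insensitive).
Actual database behavior (BigQuery Standard SQL):
- Unquoted: preserved as written, compared case-insensitively (SELECT MyColumn → stored as MyColumn, MyColumn=mycolumn=MYCOLUMN)
- Quoted: preserved as written, compared case-insensitively (SELECT `MyColumn` → MyColumn=mycolumn)
- Column names are case-insensitive (unlike table names)
Compatibility with legacy TSQLEnv:
- columnCollationCaseSensitive = false → ⚠️ PARTIAL (legacy folded to UPPER, new preserves case but is INSENSITIVE)
Impact on test cases:
- ✅ The new rule is correct: BigQuery column names are case-insensitive
- ⚠️ Legacy tests that expect "MyColumn" → "MYCOLUMN" will fail
- ⚠️ Update those expectations to the preserved form (e.g. "MyColumn" stays "MyColumn")
- ✅ Comparisons should still ignore case (MyColumn = mycolumn = MYCOLUMN)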
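Because BigQuery table names and column names follow different rules, a resolver needs two kinds of name sets. The sketch below uses standard JDK collections to illustrate the contrast; the class and method names are hypothetical and say nothing about how the library stores names internally.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration: case-sensitive table scope vs case-insensitive column scope.
final class BigQueryScopeSketch {
    // Table names: exact, case-sensitive membership.
    static Set<String> tableNames(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    // Column names: original spelling is kept, membership ignores case.
    static Set<String> columnNames(String... names) {
        Set<String> set = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        set.addAll(Arrays.asList(names));
        return set;
    }

    public static void main(String[] args) {
        System.out.println(tableNames("MyTable").contains("mytable"));
        System.out.println(columnNames("MyColumn").contains("mycolumn"));
    }
}
```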
-
forDB2
DB2 / Netezza / Exasol identifier rules (same as Oracle).
Actual database behavior (DB2 11+):
- Unquoted: folded to upper case, compared case-insensitively (same as Oracle)
- Quoted: preserved as written, compared case-sensitively
Compatibility with legacy TSQLEnv:
- DB2 tableCollationCaseSensitive = true → ❌ INCOMPATIBLE (legacy preserved case, new folds to UPPER)
- DB2 catalogCollationCaseSensitive = true → ❌ INCOMPATIBLE (legacy preserved case, new folds to UPPER)
- See forOracle() for details
-
forSnowflake
Snowflake identifier rules (same as Oracle).
Actual database behavior (Snowflake):
- Unquoted: folded to upper case, compared case-insensitively (same as Oracle)
- Quoted: preserved as written, compared case-sensitively
Compatibility with legacy TSQLEnv:
- tableCollationCaseSensitive = false → ⚠️ PARTIAL (both fold to UPPER, COMPATIBLE)
- columnCollationCaseSensitive = false → ⚠️ PARTIAL (both fold to UPPER, COMPATIBLE)
- See forOracle() for details
-
forHANA
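SAP HANA identifier rules (same as Oracle).
Actual database behavior (SAP HANA):
- Unquoted: folded to upper case, compared case-insensitively (same as Oracle)
- Quoted: preserved as written, compared case-sensitively
Compatibility with legacy TSQLEnv:
- tableCollationCaseSensitive = false → ⚠️ PARTIAL (both fold to UPPER, COMPATIBLE)
- See forOracle() for details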
-
forPresto
Presto / Trino identifier rules.
Actual database behavior (Presto/Trino):
- Unquoted: folded to lower case, compared case-insensitively (CREATE TABLE MyTable → stored as mytable)
- Quoted: preserved as written, but follows the unquoted rule (comparison is still case-insensitive)
Compatibility with legacy TSQLEnv:
- tableCollationCaseSensitive = false → ⚠️ PARTIAL (legacy folded to UPPER, new folds to LOWER)
- columnCollationCaseSensitive = false → ⚠️ PARTIAL (legacy folded to UPPER, new folds to LOWER)
Impact on test cases:
- ✅ The new rule is correct: Presto/Trino folds to lower case, and quoted identifiers follow the unquoted rule
- ⚠️ Legacy tests that expect "MyTable" → "MYTABLE" will fail
- ⚠️ Update those expectations to lower case (e.g. "MyTable" → "mytable")
IdentifierRules configuration:
- Unquoted: folded to lower case (IdentifierRules.CaseFold.LOWER), case-insensitive (IdentifierRules.CaseCompare.INSENSITIVE)
- Quoted: preserved as written (IdentifierRules.CaseFold.NONE), same as unquoted (IdentifierRules.CaseCompare.SAME_AS_UNQUOTED)
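One way to read SAME_AS_UNQUOTED is as a placeholder that is resolved to the unquoted comparison mode before any comparison happens. The sketch below illustrates that reading with hypothetical names; how the library actually resolves this constant is not shown on this page, so treat the resolution step as an assumption.

```java
// Hypothetical illustration of resolving SAME_AS_UNQUOTED; not the library API.
final class SameAsUnquotedSketch {
    enum CaseCompare { SENSITIVE, INSENSITIVE, SAME_AS_UNQUOTED }

    // Returns the comparison mode that effectively applies to quoted identifiers.
    static CaseCompare effectiveQuotedCompare(CaseCompare unquoted, CaseCompare quoted) {
        return quoted == CaseCompare.SAME_AS_UNQUOTED ? unquoted : quoted;
    }

    // Compares two quoted names under the resolved mode.
    static boolean quotedEquals(String a, String b, CaseCompare unquoted, CaseCompare quoted) {
        CaseCompare mode = effectiveQuotedCompare(unquoted, quoted);
        return mode == CaseCompare.INSENSITIVE ? a.equalsIgnoreCase(b) : a.equals(b);
    }

    public static void main(String[] args) {
        // Presto-style configuration: unquoted INSENSITIVE, quoted SAME_AS_UNQUOTED.
        System.out.println(quotedEquals("MyTable", "mytable",
                CaseCompare.INSENSITIVE, CaseCompare.SAME_AS_UNQUOTED));
    }
}
```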
-
forVertica
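Vertica identifier rules (same as Presto).
Actual database behavior (Vertica):
- Unquoted: folded to lower case, compared case-insensitively (same as Presto)
- Quoted: preserved as written, but follows the unquoted rule
Compatibility with legacy TSQLEnv:
- tableCollationCaseSensitive = false → ⚠️ PARTIAL (legacy folded to UPPER, new folds to LOWER)
- See forPresto() for details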
-
forHive
Hive / SparkSQL / Impala identifier rules (same as PostgreSQL).
Actual database behavior (Hive 3+, SparkSQL 3+):
- Unquoted: folded to lower case, compared case-insensitively (same as PostgreSQL)
- Quoted: preserved as written, compared case-sensitively
Compatibility with legacy TSQLEnv:
- tableCollationCaseSensitive = false → ⚠️ PARTIAL (legacy folded to UPPER, new folds to LOWER)
- See forPostgreSQL() for details
-
forTeradata
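Teradata identifier rules (same as PostgreSQL).
Actual database behavior (Teradata 16+):
- Unquoted: folded to lower case, compared case-insensitively (same as PostgreSQL)
- Quoted: preserved as written, compared case-sensitively
Compatibility with legacy TSQLEnv:
- tableCollationCaseSensitive = false → ⚠️ PARTIAL (legacy folded to UPPER, new folds to LOWER)
- See forPostgreSQL() for details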
-
forAthena
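Athena identifier rules (same as Presto).
Actual database behavior (AWS Athena):
- Unquoted: folded to lower case, compared case-insensitively (same as Presto; Athena is based on Trino/Presto)
- Quoted: preserved as written, but follows the unquoted rule
Compatibility with legacy TSQLEnv:
- tableCollationCaseSensitive = false → ⚠️ PARTIAL (legacy folded to UPPER, new folds to LOWER)
- See forPresto() for details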
-
forGaussDB
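GaussDB identifier rules (same as PostgreSQL).
Actual database behavior (Huawei GaussDB):
- Unquoted: folded to lower case, compared case-insensitively (same as PostgreSQL; GaussDB is based on PostgreSQL)
- Quoted: preserved as written, compared case-sensitively
Compatibility with legacy TSQLEnv:
- tableCollationCaseSensitive = false → ⚠️ PARTIAL (legacy folded to UPPER, new folds to LOWER)
- See forPostgreSQL() for details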
-
forDatabricks
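Databricks identifier rules (same as Hive).
Actual database behavior (Databricks SQL):
- Unquoted: folded to lower case, compared case-insensitively (same as Hive/SparkSQL)
- Quoted: preserved as written, compared case-sensitively
Compatibility with legacy TSQLEnv:
- tableCollationCaseSensitive = false → ⚠️ PARTIAL (legacy folded to UPPER, new folds to LOWER)
- See forHive() and forPostgreSQL() for details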
-
forGeneric
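Generic rules (default: same as PostgreSQL).
Notes:
- Used when the database type is unknown or not in the supported list
- Defaults to PostgreSQL behavior (folded to lower case, compared case-insensitively)
Compatibility with legacy TSQLEnv:
- defaultCollationCaseSensitive = false → ⚠️ PARTIAL (legacy folded to UPPER, new folds to LOWER)
- See forPostgreSQL() for details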
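The fallback role of the generic rules can be sketched as a vendor lookup with a default. Everything here is hypothetical: the vendor keys and the `factoryFor` helper are invented for illustration, and the sketch returns factory method names as strings because the library class itself is not available in a self-contained example.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical dispatch from a vendor name to the matching factory method name.
final class VendorDispatchSketch {
    private static final Map<String, String> FACTORY_BY_VENDOR = new HashMap<>();
    static {
        // Assumed vendor keys; the real lookup keys are not documented on this page.
        FACTORY_BY_VENDOR.put("oracle", "forOracle");
        FACTORY_BY_VENDOR.put("postgresql", "forPostgreSQL");
        FACTORY_BY_VENDOR.put("clickhouse", "forClickHouse");
        FACTORY_BY_VENDOR.put("sqlserver", "forSQLServer");
        FACTORY_BY_VENDOR.put("snowflake", "forSnowflake");
        FACTORY_BY_VENDOR.put("presto", "forPresto");
    }

    // Unknown or missing vendors fall back to the generic rules.
    static String factoryFor(String vendor) {
        if (vendor == null) {
            return "forGeneric";
        }
        return FACTORY_BY_VENDOR.getOrDefault(vendor.toLowerCase(Locale.ROOT), "forGeneric");
    }

    public static void main(String[] args) {
        System.out.println(factoryFor("Oracle"));
        System.out.println(factoryFor("unknowndb"));
    }
}
```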
-
toString
-