Class ColumnSource

Object
gudusoft.gsqlparser.resolver2.model.ColumnSource

public class ColumnSource extends Object
Represents the source of a column reference. Tracks where a column comes from, including intermediate transformations through subqueries and CTEs.

Design principles:

  1. Immutable - once created, cannot be modified
  2. Recursive - can trace back through subquery/CTE layers
  3. Confidence-scored - supports evidence-based inference
  • Constructor Details

  • Method Details

    • getSourceNamespace

    • getExposedName

    • getDefinitionNode

    • getDefinitionLocation

    • getConfidence

      public double getConfidence()
    • getEvidence

      public String getEvidence()
    • getEvidenceDetail

      Get the structured evidence detail for this resolution.

      This is the preferred way to access resolution evidence as it provides:

      • Type-safe evidence type (enum)
      • Confidence weight with clear semantics
      • Source location for traceability
      • Human-readable messages
      Returns:
      The structured evidence detail, or null if not available
    • getEvidenceType

      Get the evidence type from the structured evidence detail. Convenience method for common use cases.
      Returns:
      The evidence type, or null if no evidence detail
    • hasDefiniteEvidence

      public boolean hasDefiniteEvidence()
      Check if this resolution has definite evidence (not inferred). Definite evidence comes from DDL, metadata, or explicit definitions.
      Returns:
      true if evidence is definite
    • getFinalTable

      Get the final physical table this column originates from after tracing through all subqueries and CTEs.

      Semantic Difference: getFinalTable() vs TObjectName.getSourceTable()

      • getFinalTable() (this method): The final physical table after recursively tracing through all subqueries and CTEs. Use this for data lineage.
      • TObjectName.getSourceTable(): The immediate source in the current scope. For a column from a subquery, this points to the subquery's TTable itself.

      Example

      
       SELECT title FROM (SELECT * FROM books) sub
      
       For the 'title' column in outer SELECT:
       - TObjectName.getSourceTable() → TTable for subquery 'sub' (immediate source)
       - ColumnSource.getFinalTable() → TTable for 'books' (final physical table)
       

      For calculated columns in subqueries (expressions like START_DT - x AS alias), this returns null because such calculated columns don't originate from a physical table - they are derived values computed in the subquery.

      Note: For CTEs, calculated columns ARE the CTE's own columns, so they trace to the CTE itself (handled by CTENamespace.getFinalTable()).

      Returns:
      The physical table, or null if unable to determine or if calculated in subquery
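
The tracing rule described for getFinalTable() can be sketched in plain Java. This is a minimal illustration, not the library's implementation: FinalTableSketch and Layer are hypothetical stand-ins for the resolver's namespaces, and the sketch assumes each subquery layer records only its pass-through columns, so any other column is treated as calculated and yields null.

```java
import java.util.Map;

// Hypothetical stand-in types for illustration; these are not gsqlparser classes.
final class FinalTableSketch {

    /** One query layer: a physical table (inner == null) or a subquery over an inner layer. */
    static final class Layer {
        final String name;                      // table name or subquery alias
        final Layer inner;                      // null for a physical table
        final Map<String, String> passThrough;  // exposed column -> column in the inner layer

        Layer(String name, Layer inner, Map<String, String> passThrough) {
            this.name = name;
            this.inner = inner;
            this.passThrough = passThrough;
        }
    }

    /** Walks inward through subquery layers; returns null for calculated or unknown columns. */
    static String finalTable(Layer layer, String column) {
        Layer current = layer;
        String name = column;
        while (current.inner != null) {
            String innerName = current.passThrough.get(name);
            if (innerName == null) {
                return null; // assumed: not a pass-through column, so no physical origin
            }
            name = innerName;
            current = current.inner;
        }
        return current.name;
    }
}
```

With a `books` layer wrapped by a `sub` layer that passes `title` through, `finalTable(sub, "title")` yields `"books"`, while a column that `sub` computes itself yields null.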
    • getAllFinalTables

      Get all physical tables that this column might originate from.

      For columns from UNION queries, this returns tables from ALL branches, not just the first one. This is essential for proper lineage tracking where a column like actor_id in a UNION query should be linked to actor.actor_id, actor2.actor_id, actor3.actor_id.

      For regular single-table sources, this returns a single-element list with the same table as getFinalTable().

      Returns:
      List of all physical tables, or empty list if unable to determine
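
As a rough sketch of the UNION behaviour, the following plain-Java example collects the physical tables of every branch rather than only the first. UnionLineageSketch and Node are hypothetical stand-ins, not gsqlparser types; how the real resolver orders or de-duplicates the result is not stated in this documentation, so the sketch simply keeps branch order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in types for illustration; these are not gsqlparser classes.
final class UnionLineageSketch {

    /** A column source that is either one physical table or a UNION of branches. */
    static final class Node {
        final String table;         // non-null only for a physical table
        final List<Node> branches;  // empty for a physical table

        private Node(String table, List<Node> branches) {
            this.table = table;
            this.branches = branches;
        }

        static Node table(String name) {
            return new Node(name, Collections.<Node>emptyList());
        }

        static Node union(Node... branches) {
            return new Node(null, Arrays.asList(branches));
        }
    }

    /** Collects the physical tables of every branch, in branch order. */
    static List<String> allFinalTables(Node node) {
        List<String> tables = new ArrayList<>();
        collect(node, tables);
        return tables;
    }

    private static void collect(Node node, List<String> out) {
        if (node.table != null) {
            out.add(node.table);
            return;
        }
        for (Node branch : node.branches) {
            collect(branch, out);
        }
    }
}
```

For a union over `actor`, `actor2`, and `actor3`, the sketch returns all three names; for a single table it returns a one-element list.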
    • isCalculatedColumn

      public boolean isCalculatedColumn()
      Check if this column source represents a calculated expression.

      A column is calculated if its definition is a TResultColumn with a non-simple expression (not a direct column reference or star).

      For inferred columns (via star expansion), we trace back to the source CTE/subquery to check if the original column is calculated.

      Returns:
      true if this is a calculated column
    • isColumnAlias

      public boolean isColumnAlias()
      Check if this column source represents a column alias (renamed column).

      A column is an alias if it's a simple column reference in a subquery that has been given a different name via AS, NAMED, or the alias = col form. For example:

      • SELECT col AS alias FROM table - alias is different from col
      • SELECT col (NAMED alias) FROM table - Teradata NAMED syntax
      • SELECT alias = col FROM table - SQL Server proprietary syntax

      Column aliases should NOT trace to base tables because the alias name doesn't exist as an actual column in the base table.

      Returns:
      true if this is a column alias with a different name than the original
    • isCTEExplicitColumn

      public boolean isCTEExplicitColumn()
      Check if this column is a CTE explicit column with a different name than the underlying column.

      A CTE explicit column is one defined in the CTE's column list that maps to a different column name in the CTE's SELECT list. For example:

       WITH cte(c1, c2) AS (SELECT id, name FROM users)
       SELECT c1 FROM cte  -- c1 maps to 'id', names differ
       

      CTE explicit columns should NOT trace to base tables because the explicit column name (c1) doesn't exist as an actual column in the base table (users).

      Returns:
      true if this is a CTE explicit column with a different name
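
The positional mapping between a CTE column list and its SELECT list can be illustrated as follows. This is only a sketch under stated assumptions: CteColumnSketch is a hypothetical helper, not part of the library, and the case-insensitive name comparison is an assumption, since identifier case rules vary by SQL dialect.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; this is not a gsqlparser class.
final class CteColumnSketch {

    /** Pairs each explicit CTE column with the SELECT-list column at the same position. */
    static Map<String, String> mapExplicitColumns(List<String> cteColumns, List<String> selectColumns) {
        if (cteColumns.size() != selectColumns.size()) {
            throw new IllegalArgumentException("column list and SELECT list differ in length");
        }
        Map<String, String> mapping = new LinkedHashMap<>();
        for (int i = 0; i < cteColumns.size(); i++) {
            mapping.put(cteColumns.get(i), selectColumns.get(i));
        }
        return mapping;
    }

    /** True when the explicit name differs from the underlying column name (case-insensitive, assumed). */
    static boolean isRenamed(Map<String, String> mapping, String cteColumn) {
        String underlying = mapping.get(cteColumn);
        return underlying != null && !underlying.equalsIgnoreCase(cteColumn);
    }
}
```

For `WITH cte(c1, c2) AS (SELECT id, name FROM users)`, the mapping pairs `c1` with `id` and `c2` with `name`, so both count as renamed.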
    • getOverrideTable

      Get the override table, if set.
    • getCandidateTables

      Get the candidate tables for ambiguous columns.

      When a column could come from multiple tables (e.g., SELECT * FROM t1, t2), this returns all possible source tables. End users can iterate through this list to understand all potential sources for the column.

      Returns:
      List of candidate tables, or empty list if not ambiguous
    • isAmbiguous

      public boolean isAmbiguous()
      Check if this column has multiple candidate tables (is ambiguous).
      Returns:
      true if there are multiple candidate tables
    • getFieldPath

      Get the field path for deep/record field access.

      When a column reference includes field access beyond the base column, this returns the field path. For example, in customer.address.city, if base column is customer, this returns a FieldPath with segments ["address", "city"].

      Returns:
      The field path, or null if no field access
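
To make the base-column and field-path split concrete, here is a small sketch in plain Java. FieldPathSketch is a hypothetical helper and does not use the library's FieldPath class. It splits naively on dots, so it does not handle quoted identifiers that contain dots, and it returns an empty list (rather than null) when there is no field access.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration; this is not the gsqlparser FieldPath class.
final class FieldPathSketch {

    /** Returns the segments that follow the base column, or an empty list if there are none. */
    static List<String> fieldSegments(String reference, String baseColumn) {
        String prefix = baseColumn + ".";
        if (!reference.startsWith(prefix)) {
            return Collections.emptyList();
        }
        // Naive split: assumes no quoted identifiers containing dots.
        return Arrays.asList(reference.substring(prefix.length()).split("\\."));
    }
}
```

Calling `fieldSegments("customer.address.city", "customer")` gives the segments `address` and `city`.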
    • hasFieldPath

      public boolean hasFieldPath()
      Check if this column source has a field path (deep/record field access).
      Returns:
      true if a non-empty field path exists
    • isStructFieldAccess

      public boolean isStructFieldAccess()
      Check if this is a struct field access (has evidence "struct_field_access").

      This is a convenience method for checking if this column source represents a struct/record field dereference operation.

      Returns:
      true if this is a struct field access
    • isDefinite

      public boolean isDefinite()
      Checks whether this is a definite resolution (confidence = 1.0).
    • isInferred

      public boolean isInferred()
      Checks whether this is an inferred resolution (confidence < 1.0).
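
The relationship between the confidence score and the definite/inferred checks can be sketched as below. ConfidenceSketch is a hypothetical class, not the library's code, and the assumption that confidence lies in the range 0.0 to 1.0 is ours; the documentation only states the 1.0 threshold.

```java
// Hypothetical class for illustration; this is not gsqlparser's ColumnSource.
final class ConfidenceSketch {

    private final double confidence;

    ConfidenceSketch(double confidence) {
        // Assumed range [0.0, 1.0]; the negated form also rejects NaN.
        if (!(confidence >= 0.0 && confidence <= 1.0)) {
            throw new IllegalArgumentException("confidence must be within [0.0, 1.0]");
        }
        this.confidence = confidence;
    }

    /** Definite means full confidence. */
    boolean isDefinite() {
        return confidence == 1.0;
    }

    /** Inferred means anything below full confidence. */
    boolean isInferred() {
        return confidence < 1.0;
    }
}
```

Within the assumed range, exactly one of the two checks is true for any value.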
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • withConfidence

      public ColumnSource withConfidence(double newConfidence, String newEvidence)
      Deprecated.
      Creates a copy with updated confidence and evidence. Used when merging or updating inference results.
    • withEvidence

      Creates a copy with updated ResolutionEvidence. This is the preferred method for updating evidence in new code.
      Parameters:
      newEvidence - The new evidence detail
      Returns:
      A new ColumnSource with updated evidence
    • withCandidateTables

      Creates a copy with candidate tables. Used when a column could come from multiple tables.
    • withFieldPath

      public ColumnSource withFieldPath(FieldPath newFieldPath)
      Creates a copy with a field path for deep/record field access.

      This method is used when resolving struct/record field access patterns like customer.address.city. The base column is preserved as the exposedName, and the field path captures the remaining segments.

      Parameters:
      newFieldPath - The field path segments (beyond the base column)
      Returns:
      A new ColumnSource with the field path set
    • withFieldPath

      public ColumnSource withFieldPath(List<String> segments)
      Creates a copy with a field path from a list of segments.

      Convenience overload that builds the field path from plain string segments instead of a FieldPath.

      Parameters:
      segments - The field path segments
      Returns:
      A new ColumnSource with the field path set
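
The copy-on-update style used by the with* methods can be illustrated with a small immutable class. This is a sketch of the general pattern only: ImmutableSourceSketch is hypothetical, models just an exposed name and a list of path segments, and makes no claim about how ColumnSource stores its state internally.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class for illustration; this is not gsqlparser's ColumnSource.
final class ImmutableSourceSketch {

    private final String exposedName;
    private final List<String> fieldPath;

    ImmutableSourceSketch(String exposedName) {
        this(exposedName, Collections.<String>emptyList());
    }

    private ImmutableSourceSketch(String exposedName, List<String> fieldPath) {
        this.exposedName = exposedName;
        // Defensive copy so later changes to the caller's list cannot leak in.
        this.fieldPath = Collections.unmodifiableList(new ArrayList<>(fieldPath));
    }

    /** Returns a new instance; the receiver is left unchanged. */
    ImmutableSourceSketch withFieldPath(List<String> segments) {
        return new ImmutableSourceSketch(exposedName, segments);
    }

    String getExposedName() {
        return exposedName;
    }

    List<String> getFieldPath() {
        return fieldPath;
    }

    boolean hasFieldPath() {
        return !fieldPath.isEmpty();
    }
}
```

Calling `withFieldPath` on an instance for `customer` produces a second instance that keeps `customer` as the exposed name and carries the new segments, while the original still reports no field path.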
    • withFieldPath

      public ColumnSource withFieldPath(FieldPath newFieldPath, String newEvidence)
      Creates a copy with field path and updated evidence.

      This method is used when resolving struct field access, combining both the field path and the struct_field_access evidence marker.

      Parameters:
      newFieldPath - The field path segments
      newEvidence - The evidence string (e.g., "struct_field_access")
      Returns:
      A new ColumnSource with field path and evidence updated
    • isColumnInTableDdl

      public static boolean isColumnInTableDdl(TTable table, String columnName)
      Check if a column exists in a table's DDL definition.

      This method checks the table's column definitions (from CREATE TABLE statements parsed in the same script) to verify if the column name is defined.

      Parameters:
      table - The table to check
      columnName - The column name to look for
      Returns:
      true if the column exists in the table's DDL, false if not found or no DDL available
    • hasTableDdl

      public static boolean hasTableDdl(TTable table)
      Check if a table has DDL metadata available (from CREATE TABLE in same script).
      Parameters:
      table - The table to check
      Returns:
      true if DDL metadata is available for this table
    • getDdlVerificationStatus

      public static int getDdlVerificationStatus(TTable table, String columnName)
      Check DDL verification status for a candidate table.

      Returns a tri-state result:

      • 1 = Column exists in table's DDL
      • 0 = Column NOT found in table's DDL (DDL available but column missing)
      • -1 = Cannot verify (no DDL available for this table)
      Parameters:
      table - The candidate table to check
      columnName - The column name to verify
      Returns:
      DDL verification status: 1 (exists), 0 (not found), -1 (no DDL)
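
The tri-state result can be sketched with a simple lookup. This is an illustrative stand-in, not the library's implementation: DdlStatusSketch is hypothetical, DDL metadata is modelled as a map from table name to its set of column names, and names are compared exactly, whereas real identifier matching is dialect-dependent.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration; this is not a gsqlparser class.
final class DdlStatusSketch {

    static final int EXISTS = 1;     // column is in the table's DDL
    static final int NOT_FOUND = 0;  // DDL is available but lacks the column
    static final int NO_DDL = -1;    // no DDL available, cannot verify

    /** Returns EXISTS, NOT_FOUND, or NO_DDL for the given table and column. */
    static int status(Map<String, Set<String>> ddlByTable, String table, String column) {
        Set<String> columns = ddlByTable.get(table);
        if (columns == null) {
            return NO_DDL;
        }
        // Exact-match comparison is an assumption of this sketch.
        return columns.contains(column) ? EXISTS : NOT_FOUND;
    }
}
```

Keeping "not found" distinct from "cannot verify" lets callers rule a table out only when its DDL is actually known.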
    • getCandidateTableDdlStatus

      Get DDL verification status for all candidate tables.

      Returns a map from each candidate table to its DDL verification status:

      • 1 = Column exists in table's DDL
      • 0 = Column NOT found in table's DDL
      • -1 = Cannot verify (no DDL available)
      Returns:
      Map of candidate tables to their DDL verification status, or empty map if no candidates
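
One way a caller might use such a status map to narrow an ambiguous column is sketched below. The narrowing policy is an assumption of this example, not documented library behaviour: prefer tables whose DDL confirms the column, and fall back to unverifiable tables only when none is confirmed. CandidateNarrowingSketch is hypothetical and keys the map by table name rather than TTable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; this is not a gsqlparser class.
final class CandidateNarrowingSketch {

    /** Picks the most plausible source tables from a table -> status (1, 0, -1) map. */
    static List<String> likelySources(Map<String, Integer> statusByTable) {
        List<String> confirmed = new ArrayList<>();
        List<String> unverifiable = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : statusByTable.entrySet()) {
            int status = entry.getValue();
            if (status == 1) {
                confirmed.add(entry.getKey());
            } else if (status == -1) {
                unverifiable.add(entry.getKey());
            }
            // status 0: DDL is present and lacks the column, so the table is dropped
        }
        return confirmed.isEmpty() ? unverifiable : confirmed;
    }
}
```

Given statuses `t1 = 1`, `t2 = 0`, `t3 = -1`, the sketch keeps only `t1`; without `t1` it falls back to `t3`.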