Class InferenceEngine

Object
gudusoft.gsqlparser.resolver2.inference.InferenceEngine

public class InferenceEngine extends Object
Engine for inferring column-to-table relationships without metadata.

The inference engine collects evidence from various sources in the SQL statement and uses it to infer which columns belong to which tables. This is particularly useful when:
 - Database metadata is not available
 - Dealing with SELECT * without schema information
 - Analyzing SQL from unknown sources

Inference process:
 1. Collect evidence from SQL statement (WHERE, JOIN, SELECT, etc.)
 2. Aggregate evidence by table and column
 3. Calculate confidence scores
 4. Generate inferred column sources

Example:

 SELECT * FROM employees e
 WHERE e.department_id = 10
   AND e.salary > 50000

 Inference:
 - "department_id" column exists in "employees" (confidence: 0.95)
 - "salary" column exists in "employees" (confidence: 0.95)
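
The aggregation step of the inference process (grouping evidence by table and column) can be sketched as follows. This is a minimal illustration, not the library's implementation: the `EvidenceAggregationSketch` class, its nested `Evidence` type, and the `add` and `columnsOf` methods are hypothetical stand-ins, and the real `InferenceEvidence` class may carry different fields and may normalize names differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for illustration only; NOT the gsqlparser classes.
class EvidenceAggregationSketch {

    /** Hypothetical evidence record: one observation that a column belongs to a table. */
    static final class Evidence {
        final String table;
        final String column;
        final double confidence;

        Evidence(String table, String column, double confidence) {
            this.table = table;
            this.column = column;
            this.confidence = confidence;
        }
    }

    // table name -> column name -> evidence collected for that table.column
    private final Map<String, Map<String, List<Evidence>>> byTable = new LinkedHashMap<>();

    /** Steps 1 and 2: record one piece of evidence under its table and column. */
    void add(Evidence e) {
        byTable.computeIfAbsent(e.table, t -> new LinkedHashMap<>())
               .computeIfAbsent(e.column, c -> new ArrayList<>())
               .add(e);
    }

    /** Columns seen for a table so far, or an empty set if there is no evidence. */
    Set<String> columnsOf(String table) {
        Map<String, List<Evidence>> columns = byTable.get(table);
        return columns == null ? new LinkedHashSet<>() : new LinkedHashSet<>(columns.keySet());
    }

    public static void main(String[] args) {
        EvidenceAggregationSketch sketch = new EvidenceAggregationSketch();
        // Evidence from "WHERE e.department_id = 10 AND e.salary > 50000"
        sketch.add(new Evidence("employees", "department_id", 0.95));
        sketch.add(new Evidence("employees", "salary", 0.95));
        System.out.println(sketch.columnsOf("employees")); // prints [department_id, salary]
    }
}
```

Keying the outer map by table and the inner map by column keeps the per-column evidence lists ready for the confidence calculation in step 3.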
 
  • Constructor Details

  • Method Details

    • addEvidence

      public void addEvidence(InferenceEvidence evidence)
      Add a piece of evidence for inference.
      Parameters:
      evidence - the evidence to add
    • addAllEvidence

      public void addAllEvidence(Collection<InferenceEvidence> evidences)
      Add multiple pieces of evidence.
      Parameters:
      evidences - the evidence to add
    • getInferredColumns

      public Set<String> getInferredColumns(String tableName)
      Get all inferred columns for a table.
      Parameters:
      tableName - the table name
      Returns:
      set of inferred column names, or empty set if none
    • getEvidence

      public List<InferenceEvidence> getEvidence(String tableName, String columnName)
      Get all evidence for a specific table.column.
      Parameters:
      tableName - the table name
      columnName - the column name
      Returns:
      list of evidence, or empty list if none
    • calculateConfidence

      public double calculateConfidence(String tableName, String columnName)
      Calculate the combined confidence for a table.column based on all evidence.

      Combines multiple pieces of evidence using the formula:

       combined = 1 - ∏(1 - conf_i)
       
      This means:
       - Multiple pieces of evidence increase confidence
       - Evidence is treated as independent (multiplicative combination)
       - Result is always in [0, 1], provided each conf_i is in [0, 1]
      Parameters:
      tableName - the table name
      columnName - the column name
      Returns:
      combined confidence [0.0, 1.0], or 0.0 if no evidence
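
As a worked instance of the formula above, two pieces of evidence with confidence 0.5 each combine to 1 - (0.5 × 0.5) = 0.75. The sketch below reproduces that arithmetic. It only illustrates the documented formula; the `CombinedConfidenceSketch` class and its `combine` method are hypothetical and are not part of the library.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper illustrating combined = 1 - ∏(1 - conf_i); not the library's code.
class CombinedConfidenceSketch {

    /** Combines independent confidences; returns 0.0 for an empty list. */
    static double combine(List<Double> confidences) {
        double productOfComplements = 1.0;
        for (double c : confidences) {
            // Probability that this piece of evidence is wrong, accumulated multiplicatively.
            productOfComplements *= (1.0 - c);
        }
        return 1.0 - productOfComplements;
    }

    public static void main(String[] args) {
        System.out.println(combine(Arrays.asList(0.5, 0.5)));        // prints 0.75
        System.out.println(combine(Collections.<Double>emptyList())); // prints 0.0
    }
}
```

Because each factor (1 - conf_i) is at most 1, adding evidence can only shrink the product, so the combined confidence never decreases as evidence accumulates.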
    • createInferredColumnSource

      public ColumnSource createInferredColumnSource(String tableName, String columnName, TTable table)
      Create an inferred ColumnSource for a table.column.
      Parameters:
      tableName - the table name
      columnName - the column name
      table - the TTable object (may be null if not available)
      Returns:
      ColumnSource with inferred confidence, or null if no evidence
    • getTablesWithInferences

      Get all tables that have inferred columns.
      Returns:
      set of table names with inferred columns
    • getEvidenceCount

      public int getEvidenceCount()
      Get total number of pieces of evidence collected.
      Returns:
      evidence count
    • getInferredColumnCount

      public int getInferredColumnCount()
      Get total number of inferred columns across all tables.
      Returns:
      inferred column count
    • clear

      public void clear()
      Clear all evidence and inferred columns.
    • getStatistics

      Get statistics about the inference engine state.
      Returns:
      summary string
    • toString

      public String toString()
      Overrides:
      toString in class Object