SQL Adapter
Extract, validate, and transpile SQL queries from LLM responses into execution-ready formats.
SQL Adapter ensures that LLM-generated SQL statements are clean, syntactically valid, stripped of comments, and transpiled to the correct target SQL dialect. It processes free-form text, handles markdown code fences, merges contiguous statements, and handles dialect-specific differences.
All extraction and comment-stripping behaviors are always active.
Stage
| Stage | Behavior |
|---|---|
| POST only | Processes the LLM response: extracts the first valid SQL statement(s), strips inline/block comments, transpiles to the configured target dialect, and formats the output compactly |
What It Does
The module applies the following steps to every LLM response, in order:
- Extraction:
- Searches for markdown code fences (e.g.
```sqlor```) containing valid SQL statements. - If no fences are present, scans the raw response text for plain SQL queries.
- Parses the statements using an intelligent auto-detecting parser that checks the configured target dialect and common SQL dialects.
- If multiple distinct SQL blocks are found, it raises a validation exception to prevent ambiguity.
- Searches for markdown code fences (e.g.
- Comment Stripping: Recursively strips all inline comments (
-- comment) and block comments (/* comment */) from the AST. - Transpilation: Transpiles and formats the SQL statements compactly for the selected Target SQL Dialect.
- Validation: Appends a trailing semicolon to the query and returns the normalized string along with execution statistics (such as the detected source dialect and statement count).
Example
Raw LLM response:
Here is the SQL query to get the top 10 highest-paid employees:
```sql
-- Get highest earners
SELECT TOP 10 *
FROM employees
ORDER BY salary DESC; /* Sort by salary descending */
```After normalization (Target Dialect: PostgreSQL):
SELECT * FROM employees ORDER BY salary DESC NULLS LAST LIMIT 10;Configuration
Target SQL Dialect
The database SQL dialect to transpile the query to. If not specified, it defaults to PostgreSQL.
The module supports the following 31 target SQL dialects:
- Athena (
athena) - BigQuery (
bigquery) - ClickHouse (
clickhouse) - Databricks (
databricks) - Doris (
doris) - Dremio (
dremio) - Drill (
drill) - Druid (
druid) - DuckDB (
duckdb) - Dune (
dune) - Exasol (
exasol) - Fabric (
fabric) - Hive (
hive) - Materialize (
materialize) - MySQL (
mysql) - Oracle (
oracle) - PostgreSQL (
postgres, default) - Presto (
presto) - PRQL (
prql) - Redshift (
redshift) - RisingWave (
risingwave) - Snowflake (
snowflake) - Solr (
solr) - Spark SQL (
spark) - Spark 2 (
spark2) - SQLite (
sqlite) - StarRocks (
starrocks) - Tableau (
tableau) - Teradata (
teradata) - Trino (
trino) - T-SQL / SQL Server (
tsql)
Failure Behavior
If the LLM response contains no SQL queries, contains multiple distinct SQL blocks, or contains invalid SQL that cannot be parsed by any supported dialect, the module raises a pipeline exception immediately.
This ensures that downstream execution systems never run malformed or invalid queries against your databases.
Observability & Fallbacks
The module logs warning-level entries if it has to fall back and guess a source dialect (e.g., if the LLM generated SQL Server syntax but target dialect was PostgreSQL). The detected source dialect is returned in the execution statistics as source_dialect for auditing and prompt tuning.
Example Use Cases
- AI-driven BI & Analytics: Safely extract user-facing queries generated by LLMs and run them on target analytics databases.
- Dialect Portability: Transpile generic or dialect-specific SQL (e.g., T-SQL, MySQL) generated by models into database-specific syntax (e.g. Postgres, ClickHouse).
- Security & Comment Stripping: Automatically remove inline and block comments from LLM-generated code before executing it.
- Strict Single-Query Enforcement: Prevent SQL injection or multi-query execution risks by validating that exactly one SQL query block is returned.