A pure-Go PostgreSQL parser. No cgo, no C toolchain — just go build.
Need to parse PostgreSQL SQL in Go but can't use cgo? Deploying to Alpine containers, Lambda, ARM, scratch images, or anywhere that requires CGO_ENABLED=0?
postgresparser works everywhere go build works. It parses SQL into a structured intermediate representation (IR) that gives you tables, columns, joins, filters, CTEs, subqueries, and more — without executing anything.
result, err := postgresparser.ParseSQL(`
SELECT u.name, COUNT(o.id) AS order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.active = true
GROUP BY u.name
ORDER BY order_count DESC
`)
fmt.Println(result.Command) // "SELECT"
fmt.Println(result.Tables) // users, orders with aliases
fmt.Println(result.Columns) // u.name, COUNT(o.id) AS order_count
fmt.Println(result.Where) // ["u.active=true"]
fmt.Println(result.JoinConditions) // ["o.user_id=u.id"]
fmt.Println(result.GroupBy) // ["u.name"]
fmt.Println(result.ColumnUsage) // each column with its role: filter, join, projection, group, orderFor multi-statement behavior and strict mode, see Statement Count Handling.
batch, err := postgresparser.ParseSQLAll(`
CREATE TABLE public.api_key (
id integer NOT NULL
);
CREATE TABLE public.sometable (
id integer NOT NULL
);`)
if err != nil {
log.Fatal(err)
}
fmt.Println(len(batch.Statements)) // 2
fmt.Println(batch.Statements[0].Query.DDLActions[0].ObjectName) // "api_key"
fmt.Println(batch.Statements[1].Query.DDLActions[0].ObjectName) // "sometable"Performance: With SLL prediction mode, most queries parse in 70–350 µs.
go get github.com/valkdb/postgresparser- Query linting — detect missing WHERE on DELETEs, flag SELECT *, enforce naming conventions
- Dependency extraction — map which tables and columns a query touches, build lineage graphs
- Migration tooling — parse DDL to understand schema changes, diff CREATE statements
- Audit logging — tag log entries with structured metadata (tables, operation type, filtered columns)
- Query rewriting — inject tenant filters, add audit columns, transform SQL before execution
- Index advisors — analyze column usage patterns to suggest optimal indexes
Handles the SQL you actually write in production:
- DML: SELECT, INSERT, UPDATE, DELETE, MERGE
- DDL: CREATE TABLE (columns/type/nullability/default + PK/FK/UNIQUE/CHECK constraints), CREATE INDEX, DROP TABLE/INDEX, ALTER TABLE, TRUNCATE, COMMENT ON
- CTEs:
WITH ... ASincludingRECURSIVE, materialization hints - JOINs: INNER, LEFT, RIGHT, FULL, CROSS, NATURAL, LATERAL
- Subqueries: in SELECT, FROM, WHERE, and HAVING
- Set operations: UNION, INTERSECT, EXCEPT (ALL/DISTINCT)
- Upsert: INSERT ... ON CONFLICT DO UPDATE/DO NOTHING
- JSONB:
->,->>,@>,?,?|,?& - Window functions: OVER, PARTITION BY
- Type casts:
::type - Parameters:
$1,$2, ...
IR field reference: ParsedQuery IR Reference Comment extraction guide: Comment Extraction Guide
Use the API variant that matches your input contract:
ParseSQL(sql)parses the first statement only (backward-compatible behavior).ParseSQLAll(sql)parses all statements and returnsParseBatchResultwith oneStatements[i]result per input statement (Index,RawSQL,Query,Warnings).- A statement failed conversion when
Statements[i].Query == nil. - Correlation is deterministic:
Statements[i].Indexmaps to source statement order. HasFailuresistruewhen any statement has a nilQueryor anyWarnings.
- A statement failed conversion when
ParseSQLStrict(sql)requires exactly one statement and returnsErrMultipleStatementswhen input contains more than one.ParseSQLWithOptions(sql, opts),ParseSQLAllWithOptions(sql, opts), andParseSQLStrictWithOptions(sql, opts)expose optional extraction flags.IncludeCreateTableFieldCommentsenables inline--field-comment extraction inCREATE TABLE.COMMENT ONextraction is always enabled.
See docs/supported-statements.md for full details on parsed commands, graceful handling (e.g. SET/SHOW/RESET), and what's currently UNKNOWN or unsupported.
| Category | Statements | Status |
|---|---|---|
| DML | SELECT, INSERT, UPDATE, DELETE, MERGE | Full IR extraction |
| DDL | CREATE TABLE, ALTER TABLE, DROP TABLE/INDEX, CREATE INDEX, TRUNCATE, COMMENT ON | Full IR extraction |
| Utility | SET, SHOW, RESET | Graceful — returns UNKNOWN, no error |
| Other | GRANT, REVOKE, CREATE VIEW/FUNCTION/TRIGGER, COPY, EXPLAIN, VACUUM, BEGIN/COMMIT/ROLLBACK, etc. | Not yet supported — may error or return UNKNOWN |
The analysis subpackage provides higher-level intelligence on top of the IR:
analysis.AnalyzeSQL(sql)analyzes the first statement only (matchesParseSQL).analysis.AnalyzeSQLAll(sql)analyzes all statements and returnsSQLAnalysisBatchResult.analysis.AnalyzeSQLStrict(sql)requires exactly one statement (matchesParseSQLStrict).
Know exactly how every column is used — filtering, joining, projection, grouping, ordering:
result, err := analysis.AnalyzeSQL("SELECT o.id, c.name FROM orders o JOIN customers c ON o.customer_id = c.id WHERE o.status = 'active'")
for _, cu := range result.ColumnUsage {
fmt.Printf("%s.%s → %s\n", cu.TableAlias, cu.Column, cu.UsageType)
}
// o.id → projection
// c.name → projection
// o.customer_id → join
// c.id → join
// o.status → filterPull structured conditions with operators and values:
conditions, _ := analysis.ExtractWhereConditions("SELECT * FROM orders WHERE status = 'active' AND total > 100")
for _, c := range conditions {
fmt.Printf("%s %s %v\n", c.Column, c.Operator, c.Value)
}
// status = active
// total > 100AnalyzeSQL exposes the syntactic role of each ? or $N placeholder without
re-parsing or scanning SQL text:
result, _ := analysis.AnalyzeSQL("SELECT * FROM t WHERE id = ? LIMIT ?")
for _, p := range result.Placeholders {
fmt.Printf("placeholder %d: role=%s\n", p.Index, p.Role)
}
// Output:
// placeholder 1: role=where_value
// placeholder 2: role=limitRoles cover common placeholder positions such as predicate values, function
arguments, GROUP BY and ORDER BY ordinals, LIMIT, OFFSET, INTERVAL,
array members, INSERT values, and UPDATE SET values. This supports generic
use cases such as ORM type-checking, query rewriting, SQL linting, and
normalized-SQL inspection.
Pass in your schema metadata and get back foreign key relationships — no heuristic guessing:
schema := map[string][]analysis.ColumnSchema{
"customers": {
{Name: "id", PGType: "bigint", IsPrimaryKey: true},
{Name: "name", PGType: "text"},
},
"orders": {
{Name: "id", PGType: "bigint", IsPrimaryKey: true},
{Name: "customer_id", PGType: "bigint"},
},
}
joins, _ := analysis.ExtractJoinRelationshipsWithSchema(
"SELECT * FROM orders o JOIN customers c ON o.customer_id = c.id",
schema,
)
// orders.customer_id → customers.idFor CREATE TABLE parsing, see examples/ddl/.
With SLL prediction mode, postgresparser parses most queries in 70–350 µs with minimal allocations. The IR extraction layer accounts for only ~3% of CPU — the rest is ANTLR's grammar engine, which SLL mode keeps fast.
See the Performance Guide for benchmarks, profiling results, and optimization details.
See the examples/ directory:
basic/— Parse SQL and inspect the IRanalysis/— Column usage, WHERE conditions, JOIN relationshipsddl/— Parse CREATE TABLE / ALTER TABLE plus DELETE command metadatamulti_statement/— CorrelateParseSQLAlloutput back to each input statement and detect failures (Query == nil)sll_mode/— SLL prediction mode for maximum throughput
Built on ANTLR4 grammar files in grammar/. To regenerate after modifying:
antlr4 -Dlanguage=Go -visitor -listener -package gen -o gen grammar/PostgreSQLLexer.g4 grammar/PostgreSQLParser.g4This is an ANTLR4-based grammar, not PostgreSQL's internal server parser. Some edge-case syntax may differ across PostgreSQL versions. If you find a query that parses in PostgreSQL but fails here, please open an issue with a minimal repro.
ParseSQL processes the first SQL statement for backward compatibility. For multi-statement input, use ParseSQLAll; to enforce exactly one statement, use ParseSQLStrict.
Apache License 2.0 — see LICENSE for details.