pg-dump-v17
postgres-compat specs/postgres-compat/pg-dump-v17.kmd
Specification body
pg-dump v17 — Wire Compatibility Spec
Normative spec for what pg_dump --version 17 expects from a Postgres-wire
server. Used as the contract that infra/data/kdb (kdb-pgwire / kdb-gateway)
must satisfy to support pg_dump round-trips, and as the source-of-truth
for the per-table coverage work tracked in infra/data/kdb#341.
1. Provenance
The catalogue of queries below was captured live from PostgreSQL 17.9
(Debian 17.9-0+deb13u1) using
infra/data/kdb/scripts/pgdump-v17-capture.sh.
The script spins a throw-away PG 17 cluster, applies a varied schema
(5 tables, 4 indexes including partial, 2 explicit sequences, FK + CHECK
constraints, view, materialized view, enum type), and runs pg_dump
under log_statement = all so the server records the exact stream of
statements pg_dump issued.
Captures are checked in under
meta/docs/stack/specs/postgres-compat/captures/pg-dump-v17-queries-<flag>.sql
for the following flag combinations:
| File suffix | Flags |
|---|---|
default | (no flags) |
schema-only | --schema-only |
data-only | --data-only |
no-owner-no-acl | --no-owner --no-acl |
table-users | --table=users |
Re-capture by running the script — output is byte-stable across runs of
the same PG 17 minor (pg_dump does not emit non-deterministic queries).
2. Session prelude (every invocation)
The first ~13 statements in every pg_dump run are session-shape GUCs.
Every server claiming pg_dump-compat MUST accept these without error
(returning SET / SELECT is sufficient; the values can be ignored if
the server has no equivalent knob):
SELECT pg_catalog.set_config('search_path', '', false);
SELECT pg_catalog.pg_is_in_recovery();
SELECT pg_catalog.set_config('search_path', '', false);
SET DATESTYLE = ISO;
SET INTERVALSTYLE = POSTGRES;
SET extra_float_digits TO 3;
SET synchronize_seqscans TO off;
SET statement_timeout = 0;
SET lock_timeout = 0;
SET idle_in_transaction_session_timeout = 0;
SET transaction_timeout = 0;
SET row_security = off;
SELECT set_config(name, 'view, foreign-table', false)
FROM pg_settings
WHERE name = 'restrict_nonsystem_relation_kind';
BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ, READ ONLY;
The final two open a read-only repeatable-read transaction that wraps the entire dump; servers without MVCC snapshots can accept these as no-ops provided they expose a stable read for the duration of the dump.
3. Catalogue queries — required surface
pg_dump v17 issues queries against the following pg_catalog relations.
Servers MUST respond with the row shape PostgreSQL 17 produces; missing
columns cause pg_dump to bail with a parser error in its
PQgetvalue(... , <col>) calls. Functions in bold are out-of-band
helpers (not relations) — they MUST exist and return the documented type.
3.1 Relations (must be queryable, even if empty)
| Relation | Min. columns pg_dump v17 reads | Notes |
|---|---|---|
pg_roles | oid, rolname | Empty result OK if --no-owner |
pg_extension | tableoid, oid, extname, extrelocatable, extversion, extconfig, extcondition | Empty result OK; pg_dump iterates the list |
pg_depend | classid, objid, refobjid (+ refclassid filter, deptype filter) | Drives extension-membership detection |
pg_namespace | tableoid, oid, nspname, nspowner, nspacl (+ acldefault('n', nspowner)) | At minimum public and pg_catalog |
pg_class | tableoid, oid, relname, relnamespace, relkind, reltype, relowner, relchecks, relhasindex, relhasrules, relpages, relhastriggers, relpersistence, reloftype, relacl, relfrozenxid, relreplident, relrowsecurity, relforcerowsecurity, relminmxid, reloptions, relispopulated, relispartition | Drives the entire object stream — see §4 below |
pg_proc | tableoid, oid, proname, prolang, pronargs, proargtypes, prorettype, proacl, pronamespace, proowner, prokind | Filter prokind <> 'a' (non-aggregates) |
pg_type | tableoid, oid, typname, typnamespace, typacl, typowner, typelem, typrelid, typtype, typisdefined | Used to resolve column types |
pg_language | tableoid, oid, lanname, lanpltrusted, lanplcallfoid, laninline, lanvalidator, lanacl, lanowner | Filter WHERE lanispl |
pg_attribute | attname, atttypmod, attstorage, attcompression, attmissingval, attstattarget, attndims, attlen, attbyval, attalign, attnotnull, atthasdef, attidentity, attgenerated, attisdropped, attislocal, attinhcount | R-CRITICAL — column shape drives CREATE TABLE reconstruction |
pg_constraint | oid, conname, contype, conrelid, conindid, connamespace, confupdtype, confdeltype, confmatchtype, condeferrable, condeferred, convalidated, conkey, confkey, confrelid | CHECK / FK / UNIQUE reconstruction |
pg_index | indrelid, indkey, indcollation, indclass, indoption, indpred, indexprs, indisclustered, indisreplident, indisvalid, indisready, indislive | Drives CREATE INDEX reconstruction |
pg_inherits | inhrelid, inhparent, inhseqno | Partition / inheritance tree |
pg_description | objoid, classoid, objsubid, description | COMMENT ON … reconstruction |
pg_publication | oid, pubname, pubowner, puballtables, pubinsert, pubupdate, pubdelete, pubtruncate, pubviaroot | Empty result OK |
pg_settings | name, setting | set_config(...) FROM pg_settings WHERE name = '…' |
pg_foreign_table | ftrelid, ftserver | Empty result OK if no FDWs |
3.2 Relations newly required by PG 17 (gaps vs PG 15/16)
These either did not exist or had different shape in earlier PG and are likely to be missing in a server claiming PG 13/14 compatibility:
| Relation | Why pg_dump v17 needs it |
|---|---|
pg_cast | Resolves implicit-cast paths during type reconstruction |
pg_collation | Per-column / per-index collation names |
pg_conversion | Server-encoding conversions |
pg_default_acl | Default privileges from ALTER DEFAULT PRIVILEGES … |
pg_event_trigger | Event triggers (DDL hooks) — empty result usually OK |
pg_init_privs | Captures the "original ACL" assigned by an extension |
pg_opclass | Index operator class lookup |
pg_operator | Operator definitions (mostly transitive via opfamily) |
pg_opfamily | Index operator family |
pg_publication_namespace | Publication-by-schema (PG 15+) |
pg_range | Range type bounds |
pg_rewrite | View / materialized-view definitions (joined with pg_get_viewdef) |
pg_seclabels | Security labels (SECURITY LABEL …) |
pg_sequence | Sequence parameters (PG 10+ split-out of pg_class.relkind = 'S') |
pg_statistic_ext | Extended statistics objects |
pg_tablespace | Non-default tablespaces — usually empty in cloud setups |
pg_transform | Transform mappings (e.g. plpython hstore) |
pg_ts_config / pg_ts_dict / pg_ts_parser / pg_ts_template | Full-text-search assets |
3.3 Functions
| Function | Return | When pg_dump v17 calls it |
|---|---|---|
pg_is_in_recovery() | bool | Session prelude — almost certainly returns false |
pg_get_viewdef(oid) | text | View / materialized-view body |
pg_get_indexdef(oid) | text | CREATE INDEX … reconstruction |
pg_get_constraintdef(oid) | text | CHECK / FK / UNIQUE body |
pg_get_triggerdef(oid) | text | Trigger reconstruction |
pg_get_expr(node, relid) | text | Default-value / partial-index predicate body |
pg_get_serial_sequence(table, col) | text | Reidrata SERIAL / IDENTITY columns |
pg_options_to_table(text[]) | setof | FDW server OPTIONS decoding |
acldefault(char, oid) | aclitem[] | Default-ACL synthesis (called inline in §3.1 queries) |
array_remove(any[], any) | any[] | Used inline to drop check_option=… from reloptions |
3.4 PG 17 row-shape additions
The big v15→v17 movement in pg_attribute matters:
attstorageis"char"in v17 (wasoidhistorically; some legacy shims still return oid — pg_dump v17 will reject those rows).attcompressionexists from v14+ ("char", default'p'for pglz).attmissingvalisanyarray(often emitted as text).attstattargetisint2(became nullable in v17 —NULLallowed).attgeneratedis"char"('s'for STORED,''for none).attidentityis"char"('a'for ALWAYS,'d'for BY DEFAULT,''for none).
pg_class.relrowsecurity and relforcerowsecurity are bool (RLS-related).
pg_class.relispartition is bool (PG 10+); MUST exist even on
non-partitioned tables (default false).
4. Object-fetch loop (the heart of pg_dump)
After the session prelude, pg_dump runs one large pg_class query to
materialize the entire table-of-objects:
SELECT c.tableoid, c.oid, c.relname, c.relnamespace, c.relkind, c.reltype,
c.relowner, c.relchecks, c.relhasindex, c.relhasrules, c.relpages,
c.relhastriggers, c.relpersistence, c.reloftype, c.relacl,
acldefault(CASE WHEN c.relkind = 'S' THEN 's'::"char" ELSE 'r'::"char" END,
c.relowner) AS acldefault,
CASE WHEN c.relkind = 'f'
THEN (SELECT ftserver FROM pg_catalog.pg_foreign_table WHERE ftrelid = c.oid)
ELSE 0 END AS foreignserver,
c.relfrozenxid, tc.relfrozenxid AS tfrozenxid, tc.oid AS toid,
tc.relpages AS toastpages, tc.reloptions AS toast_reloptions,
d.refobjid AS owning_tab, d.refobjsubid AS owning_col,
tsp.spcname AS reltablespace, false AS relhasoids,
c.relispopulated, c.relreplident, c.relrowsecurity,
c.relforcerowsecurity, c.relminmxid, tc.relminmxid AS tminmxid,
array_remove(array_remove(c.reloptions, 'check_option=local'),
'check_option=cascaded') AS reloptions,
CASE WHEN 'check_option=local' = ANY (c.reloptions) THEN 'LOCAL'::text
WHEN 'check_option=cascaded' = ANY (c.reloptions) THEN 'CASCADED'::text
ELSE NULL END AS checkoption,
am.amname, (d.deptype = 'i') IS TRUE AS is_identity_sequence,
c.relispartition AS ispartition
FROM pg_class c …
(Truncated — see captures/pg-dump-v17-queries-schema-only.sql for the
full text.) The relkind values returned drive the rest of the dump:
relkind | Followed by |
|---|---|
'r' | pg_attribute → pg_constraint → pg_index → LOCK TABLE … ACCESS SHARE → COPY … TO STDOUT (data-only) |
'v' | pg_get_viewdef(oid) |
'm' | pg_get_viewdef(oid) (materialized) + REFRESH MATERIALIZED VIEW directive |
'S' | pg_sequence row + setval('…', last_value, is_called) |
'i' | Index — usually emitted by following pg_index |
'p' | Partitioned table — followed by pg_inherits to enumerate children |
'c' | Composite type — pg_dump may skip if the type is system-managed |
'f' | Foreign table — pulls the server name from the foreignserver correlated subquery |
The LOCK TABLE … IN ACCESS SHARE MODE is non-negotiable: servers MUST
either implement table locks or accept the syntax as a no-op (returning
LOCK TABLE). Without it, pg_dump v17 aborts the dump.
5. Data path (--data-only)
For each non-system table whose relkind = 'r', pg_dump issues:
COPY public.users (id, handle, email, score, created_at,
tags, metadata, avatar, status) TO STDOUT;
The column list is materialized from the pg_attribute query above. The
server MUST emit the response on the wire as
CopyOutResponse → CopyData* → CopyDone → CommandComplete("COPY n").
Text format is the default; binary is not used by pg_dump v17.
Implementation in kdb-pgwire: ticket infra/data/kdb#340.
6. Out-of-scope for v17 compat
- Foreign-key reconstruction across schemas with
--schema=…— pg_dump v17 chases FKs to outside-schema parents; the audit script pins topublicso the captures do not exercise this corner. - Server-side
COMMENT ON LARGE OBJECT— large-object dump uses a separate code path (--blobs/--no-blobs) not exercised by the audit. Tracked separately if/when lobs become a goal. - Parallel pg_dump (
-j N) — parallel restore uses extra COPY sessions but the per-session query stream is the same. pg_dump --inserts— usesINSERT INTO … VALUES (…)instead of COPY. Audit captures focus on the default COPY path.
7. Versioning
Captures are pinned to PG 17.9. Re-run the capture script when:
- PostgreSQL 18 ships and
pg_dumpv18 diverges (verify with a fresh capture against PG 18 in/usr/lib/postgresql/18/bin/). - A user reports a pg_dump failure on kdb-gateway that the diff report does not cover.
See kdb-gateway-v0.0.13-vs-pg17-pgdump.md
for the per-query coverage delta against the current kdb-pgwire surface.
8. Conformance status (2026-05-11)
The 8 originally-identified R-CRITICAL gaps have been closed across a
multi-day series of fixes culminating in the FK round-trip milestone.
Latest harness state (scripts/pg-dump-roundtrip.sh):
| Check | State |
|---|---|
pg_dump --schema-only exit code | 0 ✓ |
pg_dump --data-only exit code | 0 ✓ |
psql -f kdb-schema.sql (PG 17) | rc=0 ✓ |
psql -f kdb-data.sql (PG 17) | rc=0 ✓ |
| Row-count round-trip | 100% match ✓ |
| FK guard (orphan INSERT rejected) | blocked (enforced) ✓ |
| Schema diff lines | 35 (was 77 originally) |
Closed work-items
#341umbrella — pg_catalog coverage for the 8 R-CRITICAL gaps#342— round-trip test harness#343— extended-queryset_configemission#344(16 waves) — universaltableoid+ per-table column gaps#345— plannerIS TRUE / IS FALSE#346— subquery catalog inheritance#347—IS DISTINCT FROM#348— text[N]subscript#349—ARRAY (SELECT …)constructor#350—Text → Array<T>cast#351—pg_catalog.prefix-strip for aggregates#353— FK round-trip umbrella (PK ALTER, FK ALTER, conindid linkage)#357— regclass cast → oid#360— FK capture under tenant config#361— synthetic pg_index + pg_class for PK/UNIQUE
Open polish items (post-milestone)
#352— schema-diff cosmetic cleanup (35 lines remaining;\restrictrandom tokens unavoidable; public SCHEMA placeholder block deferred)#359— public SCHEMA block suppression (requires pg_dump v17 source vendoring to map the suppression algorithm)
Regression coverage
infra/data/kdb/crates/kdb-pgwire/tests/pgwire_fk_roundtrip.rs
guards the milestone's three load-bearing wirings (see registry
entries 518-520).