Add pg_upgrade support functions for PostgreSQL major version upgrades

NOTE: This PR was created with AI tools and a human.

The ag_graph.namespace column uses the regnamespace type, which pg_upgrade cannot handle in user tables. This commit adds four SQL functions to enable seamless PostgreSQL major version upgrades while preserving all graph data.

New functions in ag_catalog:
- age_prepare_pg_upgrade(): Converts namespace from regnamespace to oid, creates backup table with graph-to-namespace mappings
- age_finish_pg_upgrade(): Remaps stale OIDs after upgrade, restores regnamespace type, invalidates AGE caches
- age_revert_pg_upgrade_changes(): Cancels preparation if upgrade is aborted
- age_pg_upgrade_status(): Returns current upgrade readiness status

Usage:
1. Before pg_upgrade: SELECT age_prepare_pg_upgrade();
2. Run pg_upgrade as normal
3. After pg_upgrade: SELECT age_finish_pg_upgrade();

The functions include automatic cache invalidation by touching graph namespaces, ensuring cypher queries work immediately without requiring a session reconnect.

Files changed:
- sql/age_pg_upgrade.sql: New file with function implementations
- sql/sql_files: Added age_pg_upgrade entry
- age--1.7.0--y.y.y.sql: Added functions for extension upgrades

All regression tests passed.
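The usage steps in the description could look roughly like this in a psql session. This is only a sketch: it assumes an AGE build that contains the four functions from this PR, the ag_catalog schema qualification is added here for safety, and the output shape of the status call is not specified in the description.

```sql
-- On the OLD cluster, in the database where AGE is installed.
-- Optional readiness check (output format is not described in this PR text).
SELECT ag_catalog.age_pg_upgrade_status();

-- 1. Convert ag_graph.namespace from regnamespace to oid and back up
--    the graph-to-namespace mappings.
SELECT ag_catalog.age_prepare_pg_upgrade();

-- 2. Run pg_upgrade from the shell as normal (not shown here).

-- On the NEW cluster, after pg_upgrade completes:
-- 3. Remap stale OIDs, restore the regnamespace type, invalidate AGE caches.
SELECT ag_catalog.age_finish_pg_upgrade();

-- If the upgrade is aborted after step 1, undo the preparation on the
-- old cluster instead:
-- SELECT ag_catalog.age_revert_pg_upgrade_changes();
```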
Pull request overview
Adds PostgreSQL pg_upgrade helper functions to Apache AGE to work around regnamespace being unsupported by pg_upgrade in user tables, enabling major-version upgrades while preserving graph metadata.
Changes:
- Introduces age_prepare_pg_upgrade(), age_finish_pg_upgrade(), age_revert_pg_upgrade_changes(), and age_pg_upgrade_status() in ag_catalog.
- Registers the new SQL file in the extension build/install process.
- Adds the same functions to the extension upgrade script for 1.7.0 -> y.y.y.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 17 comments.
| File | Description |
|---|---|
| sql/sql_files | Adds the new age_pg_upgrade SQL script to the extension SQL composition list. |
| sql/age_pg_upgrade.sql | Implements the four pg_upgrade support functions (prepare/finish/revert/status). |
| age--1.7.0--y.y.y.sql | Adds the same functions to the extension upgrade path script. |
| PERFORM pg_catalog.pg_advisory_lock(hashtext('age_finish_pg_upgrade')); | ||
| BEGIN | ||
| -- Touch each graph's namespace to invalidate caches | ||
| DECLARE | ||
| graph_rec RECORD; | ||
| BEGIN | ||
| FOR graph_rec IN SELECT namespace::text AS ns_name FROM ag_catalog.ag_graph | ||
| LOOP | ||
| EXECUTE format('ALTER SCHEMA %I OWNER TO CURRENT_USER', graph_rec.ns_name); | ||
| END LOOP; | ||
| END; | ||
| END; | ||
| PERFORM pg_catalog.pg_advisory_unlock(hashtext('age_finish_pg_upgrade')); | ||
|
|
The cache invalidation uses pg_advisory_lock()/pg_advisory_unlock() (session-level). If an error occurs while looping/altering schemas, the unlock won't execute and the session can keep the lock, blocking future calls. Prefer pg_advisory_xact_lock() or wrap the block with EXCEPTION handling that always unlocks.
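A minimal sketch of the transaction-level alternative, runnable on plain PostgreSQL without AGE. The lock key string is taken from the excerpt; the NOTICE and the follow-up query are illustrative additions.

```sql
DO $$
BEGIN
    -- Transaction-level advisory lock: released automatically at COMMIT
    -- or ROLLBACK, so there is no explicit unlock call to skip on error.
    PERFORM pg_catalog.pg_advisory_xact_lock(
        pg_catalog.hashtext('age_finish_pg_upgrade'));

    -- ... cache invalidation work would go here ...
    RAISE NOTICE 'lock is held until this transaction ends';
END;
$$;

-- Run in autocommit mode, the DO block's transaction has already ended,
-- so the lock taken above is no longer listed for this session.
SELECT count(*) AS advisory_locks_held
FROM pg_catalog.pg_locks
WHERE locktype = 'advisory'
  AND pid = pg_catalog.pg_backend_pid();
```

If the function is called inside a larger explicit transaction, the lock is held until that outer transaction ends, which is usually the desired scope for a one-shot maintenance function.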
| CREATE FUNCTION ag_catalog.age_prepare_pg_upgrade() | ||
| RETURNS void | ||
| LANGUAGE plpgsql | ||
| SET search_path = ag_catalog, pg_catalog | ||
| AS $function$ |
No regression coverage was added for the new pg_upgrade helper functions. Since the repo has an extensive regress suite, consider adding a small regression test that exercises: status reporting, prepare converting ag_graph.namespace to oid, revert restoring regnamespace, and idempotency/expected errors (finish without backup, etc.).
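A regression script along these lines might cover the cases listed in this comment. It is a sketch under assumptions: it needs an AGE build containing this PR, the graph name is hypothetical, and the status output and error text are not known from the PR description, so an expected-output file is not shown.

```sql
LOAD 'age';
SET search_path TO ag_catalog;

-- Hypothetical graph used only by this test.
SELECT create_graph('pg_upgrade_test');

-- Status reporting before any preparation.
SELECT * FROM age_pg_upgrade_status();

-- prepare should convert ag_graph.namespace to oid.
SELECT age_prepare_pg_upgrade();
SELECT atttypid::regtype AS namespace_type
FROM pg_attribute
WHERE attrelid = 'ag_catalog.ag_graph'::regclass
  AND attname = 'namespace';

-- revert should restore regnamespace.
SELECT age_revert_pg_upgrade_changes();
SELECT atttypid::regtype AS namespace_type
FROM pg_attribute
WHERE attrelid = 'ag_catalog.ag_graph'::regclass
  AND attname = 'namespace';

-- finish without a backup table: the comment lists this as an
-- expected-error case; the exact message is not known here.
SELECT age_finish_pg_upgrade();

-- Cleanup.
SELECT drop_graph('pg_upgrade_test', true);
```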
| g.namespace::regnamespace::text AS namespace_name | ||
| FROM ag_catalog.ag_graph g; |
The backup stores namespace_name as g.namespace::regnamespace::text, which can include quoted identifiers. age_finish_pg_upgrade() later joins this to pg_namespace.nspname (unquoted), so remapping can fail for schemas that require quoting. Store pg_namespace.nspname instead (e.g., join via namespace::oid) to make the mapping stable.
| g.namespace::regnamespace::text AS namespace_name | |
| FROM ag_catalog.ag_graph g; | |
| n.nspname AS namespace_name | |
| FROM ag_catalog.ag_graph g | |
| JOIN pg_namespace n ON n.oid = g.namespace::oid; |
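The quoting difference behind this suggestion can be seen on plain PostgreSQL. The schema name below is hypothetical and chosen only because mixed case forces quoting.

```sql
CREATE SCHEMA "MyGraph";  -- hypothetical schema that requires quoting

SELECT n.oid::regnamespace::text                   AS regnamespace_text,
       n.nspname::text                             AS nspname,
       n.oid::regnamespace::text = n.nspname::text AS names_match
FROM pg_catalog.pg_namespace n
WHERE n.nspname = 'MyGraph';
-- regnamespace_text carries the double quotes ("MyGraph"), nspname does
-- not (MyGraph), so names_match is false and a join on nspname would
-- miss this row.

DROP SCHEMA "MyGraph";
```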
| -- Create temporary mapping table with old and new OIDs | ||
| CREATE TEMP TABLE _graphid_mapping AS | ||
| SELECT | ||
| b.old_graphid, | ||
| b.graph_name, | ||
| n.oid AS new_graphid | ||
| FROM public._age_pg_upgrade_backup b | ||
| JOIN pg_namespace n ON n.nspname = b.namespace_name; | ||
|
|
||
| GET DIAGNOSTICS mapping_count = ROW_COUNT; | ||
|
|
||
| IF mapping_count = 0 THEN | ||
| RAISE EXCEPTION 'No OID mappings found. Schema names may have changed.'; | ||
| END IF; |
The join to pg_namespace uses n.nspname = b.namespace_name, but namespace_name was stored from regnamespace::text (potentially quoted), so this join can fail and drop mappings. Consider normalizing the stored schema name and validating that every backup row has a mapping (e.g., compare mapping_count to backup row count and raise if mismatched).
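One way to validate that every backup row has a mapping is to look for backup rows with no matching schema and fail if any exist. The sketch below runs on plain PostgreSQL; backup_demo is a hypothetical stand-in for public._age_pg_upgrade_backup, filled with schema names that always exist so the check passes.

```sql
-- Hypothetical stand-in for the backup table.
CREATE TEMP TABLE backup_demo (graph_name text, namespace_name text);
INSERT INTO backup_demo VALUES
    ('g1', 'pg_catalog'),
    ('g2', 'information_schema');

DO $$
DECLARE
    missing text;
BEGIN
    -- Collect backed-up schema names that have no pg_namespace entry.
    SELECT pg_catalog.string_agg(b.namespace_name, ', '
                                 ORDER BY b.namespace_name)
    INTO missing
    FROM backup_demo b
    LEFT JOIN pg_catalog.pg_namespace n ON n.nspname = b.namespace_name
    WHERE n.oid IS NULL;

    IF missing IS NOT NULL THEN
        RAISE EXCEPTION 'No schema found for backed-up namespace(s): %',
            missing;
    END IF;

    RAISE NOTICE 'every backup row has a matching schema';
END;
$$;
```

Reporting the unmatched names, rather than only comparing counts, makes a partial mismatch easier to diagnose.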
| -- Clean up temporary mapping table | ||
| DROP TABLE _graphid_mapping; | ||
| DROP TABLE public._age_pg_upgrade_backup; | ||
|
|
age_finish_pg_upgrade() drops public._age_pg_upgrade_backup before the schema restoration + cache invalidation steps run. If a later step fails, the backup is gone and recovery/reruns are harder. Consider dropping the backup only after all steps complete successfully.
| PERFORM pg_catalog.pg_advisory_lock(hashtext('age_revert_pg_upgrade')); | ||
| BEGIN | ||
| DECLARE | ||
| graph_rec RECORD; | ||
| BEGIN | ||
| FOR graph_rec IN SELECT namespace::text AS ns_name FROM ag_catalog.ag_graph | ||
| LOOP | ||
| EXECUTE format('ALTER SCHEMA %I OWNER TO CURRENT_USER', graph_rec.ns_name); | ||
| END LOOP; | ||
| END; | ||
| END; | ||
| PERFORM pg_catalog.pg_advisory_unlock(hashtext('age_revert_pg_upgrade')); | ||
|
|
age_revert_pg_upgrade_changes() uses session-level advisory locks without exception safety, so the lock may remain held on error. It also alters schema ownership to CURRENT_USER, which is a persistent side effect and can fail if the caller lacks the privilege. Prefer transaction-level advisory locks (pg_advisory_xact_lock()) and an invalidation method that doesn't leave owners changed, and avoid double-quoting identifiers.
| PERFORM pg_catalog.pg_advisory_lock(hashtext('age_revert_pg_upgrade')); | |
| BEGIN | |
| DECLARE | |
| graph_rec RECORD; | |
| BEGIN | |
| FOR graph_rec IN SELECT namespace::text AS ns_name FROM ag_catalog.ag_graph | |
| LOOP | |
| EXECUTE format('ALTER SCHEMA %I OWNER TO CURRENT_USER', graph_rec.ns_name); | |
| END LOOP; | |
| END; | |
| END; | |
| PERFORM pg_catalog.pg_advisory_unlock(hashtext('age_revert_pg_upgrade')); | |
| PERFORM pg_catalog.pg_advisory_xact_lock(hashtext('age_revert_pg_upgrade')); | |
| <<invalidate_caches>> | |
| DECLARE | |
| graph_rec RECORD; | |
| ns_owner text; | |
| BEGIN | |
| FOR graph_rec IN | |
| SELECT namespace::text AS ns_name | |
| FROM ag_catalog.ag_graph | |
| LOOP | |
| SELECT n.nspowner::regrole::text | |
| INTO ns_owner | |
| FROM pg_namespace n | |
| WHERE n.nspname = graph_rec.ns_name; | |
| IF ns_owner IS NOT NULL THEN | |
| EXECUTE format( | |
| 'ALTER SCHEMA %s OWNER TO %s', | |
| pg_catalog.quote_ident(graph_rec.ns_name), | |
| pg_catalog.quote_ident(ns_owner) | |
| ); | |
| END IF; | |
| END LOOP; | |
| END invalidate_caches; |
| PERFORM pg_catalog.pg_advisory_lock(hashtext('age_finish_pg_upgrade')); | ||
| BEGIN | ||
| -- Touch each graph's namespace to invalidate caches | ||
| DECLARE | ||
| graph_rec RECORD; | ||
| BEGIN | ||
| FOR graph_rec IN SELECT namespace::text AS ns_name FROM ag_catalog.ag_graph | ||
| LOOP | ||
| EXECUTE format('ALTER SCHEMA %I OWNER TO CURRENT_USER', graph_rec.ns_name); | ||
| END LOOP; | ||
| END; | ||
| END; | ||
| PERFORM pg_catalog.pg_advisory_unlock(hashtext('age_finish_pg_upgrade')); | ||
|
|
The cache invalidation uses pg_advisory_lock()/pg_advisory_unlock() (session-level locks). If an error occurs inside the BEGIN block, the unlock won't execute and the session can retain the lock, blocking future runs. Prefer pg_advisory_xact_lock() (auto-released at transaction end) or add an EXCEPTION handler that always unlocks.
| FOR graph_rec IN SELECT namespace::text AS ns_name FROM ag_catalog.ag_graph | ||
| LOOP | ||
| EXECUTE format('ALTER SCHEMA %I OWNER TO CURRENT_USER', graph_rec.ns_name); |
Using ALTER SCHEMA ... OWNER TO CURRENT_USER as a "touch" is not side-effect free: it can permanently change schema ownership and may fail if CURRENT_USER isn't allowed to change the owner. Also, namespace::text for regnamespace is already quoted, so using format('%I', ns_name) can over-quote and break for schemas needing quotes. Consider fetching pg_namespace.nspname and current owner, then toggling owner and restoring it (or another approach that preserves ownership), and ensure identifiers are quoted exactly once.
| FOR graph_rec IN SELECT namespace::text AS ns_name FROM ag_catalog.ag_graph | |
| LOOP | |
| EXECUTE format('ALTER SCHEMA %I OWNER TO CURRENT_USER', graph_rec.ns_name); | |
| FOR graph_rec IN | |
| SELECT | |
| n.nspname AS ns_name, | |
| r.rolname AS owner_name | |
| FROM ag_catalog.ag_graph g | |
| JOIN pg_catalog.pg_namespace n ON n.oid = g.namespace | |
| JOIN pg_catalog.pg_roles r ON r.oid = n.nspowner | |
| LOOP | |
| -- Temporarily change owner to CURRENT_USER to trigger cache invalidation, | |
| -- then restore the original owner to preserve permissions. | |
| EXECUTE format( | |
| 'ALTER SCHEMA %I OWNER TO %I', | |
| graph_rec.ns_name, | |
| current_user | |
| ); | |
| EXECUTE format( | |
| 'ALTER SCHEMA %I OWNER TO %I', | |
| graph_rec.ns_name, | |
| graph_rec.owner_name | |
| ); |
| DECLARE | ||
| has_graphs boolean; | ||
| BEGIN | ||
| -- Check if there are any graphs to process | ||
| SELECT EXISTS(SELECT 1 FROM ag_catalog.ag_graph) INTO has_graphs; | ||
|
|
||
| IF NOT has_graphs THEN | ||
| RAISE NOTICE 'No graphs found. Nothing to prepare for pg_upgrade.'; | ||
| RETURN; | ||
| END IF; | ||
|
|
age_prepare_pg_upgrade() returns early when there are no rows in ag_catalog.ag_graph, but pg_upgrade compatibility depends on the presence of the regnamespace-typed column, not on row count. Consider still converting ag_graph.namespace to oid (backup can be empty) so pg_upgrade succeeds even with zero graphs.
| DECLARE | |
| has_graphs boolean; | |
| BEGIN | |
| -- Check if there are any graphs to process | |
| SELECT EXISTS(SELECT 1 FROM ag_catalog.ag_graph) INTO has_graphs; | |
| IF NOT has_graphs THEN | |
| RAISE NOTICE 'No graphs found. Nothing to prepare for pg_upgrade.'; | |
| RETURN; | |
| END IF; | |
| BEGIN |
| namespace = m.new_graphid | ||
| FROM _graphid_mapping m | ||
| WHERE g.graphid = m.old_graphid; | ||
|
|
updated_graphs is never set (missing GET DIAGNOSTICS ... = ROW_COUNT after updating ag_graph), so the NOTICE will report NULL and you lose an important verification point. Capture ROW_COUNT after the UPDATE and consider asserting it matches expected mappings.
| GET DIAGNOSTICS updated_graphs = ROW_COUNT; |
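For reference, the pattern of capturing ROW_COUNT immediately after an UPDATE looks like this on plain PostgreSQL. The table and variable names are hypothetical.

```sql
CREATE TEMP TABLE row_count_demo (id int, v int);
INSERT INTO row_count_demo VALUES (1, 10), (2, 20), (3, 30);

DO $$
DECLARE
    updated_rows bigint;
BEGIN
    UPDATE row_count_demo SET v = v + 1 WHERE id <= 2;

    -- Must come directly after the statement being measured; without it
    -- the variable keeps its initial NULL.
    GET DIAGNOSTICS updated_rows = ROW_COUNT;

    RAISE NOTICE 'updated % row(s)', updated_rows;
END;
$$;
```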