fix: enable Spock repair mode for service user creation by rshoemaker · Pull Request #287 · pgEdge/control-plane

rshoemaker · 2026-03-06T15:52:28Z

fix: resolve concurrent service user creation failures

Summary

Enable Spock repair mode during service user role creation to prevent replication conflicts in multi-node database topologies
Add retry with exponential backoff to handle transient tuple concurrently updated errors when multiple service user roles are created concurrently against the same Postgres instance
Connect to the application database (where Spock is installed) instead of postgres so repair mode is correctly enabled

Context

TestProvisionMultiHostMCPService was failing intermittently with:

ERROR: tuple concurrently updated (SQLSTATE XX000)

When a database has multiple MCP service instances (e.g., on host-1, host-2, host-3), each gets its own ServiceUserRole resource. These execute concurrently on the same Postgres instance, and their GRANT statements can conflict at the catalog level. The fix addresses this with two layers:

Repair mode prevents Spock from replicating role/grant DDL in multi-node topologies
Retry (3 attempts, 500ms exponential backoff) handles same-instance concurrent catalog modifications

Verified with 30 consecutive test runs, 0 failures.

Test plan

Unit tests pass (make test, 640 tests)
TestProvisionMultiHostMCPService passes 30/30 runs (previously failed ~2-3/10)

PLAT-478

Wrap service user role creation in a transaction with Spock repair mode enabled. This prevents "tuple concurrently updated" errors when the same user is being created on multiple instances simultaneously and Spock replicates the changes.

coderabbitai · 2026-03-06T15:59:14Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3c75db42-18e5-4f54-9045-604488275cfe

📥 Commits

Reviewing files that changed from the base of the PR and between c216d57 and 3f4a388.

📒 Files selected for processing (1)

server/internal/orchestrator/swarm/service_user_role.go

📝 Walkthrough

Walkthrough

Refactors service user role creation to retry the create flow on transient errors, perform all DDL and grant statements inside a transaction connected to the target database, enable Spock repair mode when available, and add explicit commit error handling. connectToPrimary now accepts a database name.

Changes

Cohort / File(s)	Summary
Service user role implementation `server/internal/orchestrator/swarm/service_user_role.go`	Reworked create flow to retry full user creation on transient SQLSTATE XX000 errors and moved role creation into a new `createUserRole` method. All DDL and GRANT statements now execute on a transaction handle (Begin/Commit with rollback on error). Added Spock repair-mode check/enable within the transaction. Added explicit commit error wrapping. Updated `connectToPrimary` signature to accept `dbName` and adjusted call sites to pass `r.DatabaseName` for create and `"postgres"` for delete. Switched DSN construction from `AdminDSN("postgres")` to `AdminDSN(dbName)`. (+47/-7)

Poem

🐇 I hop through code with careful paws,
Began a tx, avoided flaws,
Enabled Spock, then rolled no more,
Committed safe on database floor,
Retry, log, and then applause!

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main change—enabling Spock repair mode for service user creation to fix concurrent tuple-update errors.
Description check	✅ Passed	The description covers most required sections (Summary, Changes, Test plan), follows Conventional Commits format, and provides comprehensive context including issue link (PLAT-478) and test results.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings (stacked PR)
📝 Generate docstrings (commit on current branch)

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch fix/PLAT-478/svc_user_tx_with_repair_mode

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@server/internal/orchestrator/swarm/service_user_role.go`:
- Around line 130-133: The call to postgres.IsSpockEnabled().Scalar silently
hides query errors because Scalar currently returns nil on row.Scan failures;
update the Scalar implementation in the postgres statement (the method that
implements Scalar on the Statement/Query type) so that any row.Scan error is
returned to the caller instead of nil, ensuring postgres.IsSpockEnabled().Scalar
propagates the scan/DB error back to the caller (so the existing error handling
in service_user_role.go around enabled, err := ... Scalar(ctx, tx) works as
intended).

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c1ad74e6-e8db-4423-8957-d107629152ec

📥 Commits

Reviewing files that changed from the base of the PR and between 87528bb and c216d57.

📒 Files selected for processing (1)

server/internal/orchestrator/swarm/service_user_role.go

server/internal/orchestrator/swarm/service_user_role.go

Wrap createUserRole in a retry loop (3 attempts, exponential backoff) to handle transient "tuple concurrently updated" errors when multiple service user roles are created concurrently against the same Postgres instance. Also connect to the application database instead of "postgres" so that Spock repair mode is correctly enabled (Spock is installed per-database).

coderabbitai bot reviewed Mar 6, 2026

View reviewed changes

server/internal/orchestrator/swarm/service_user_role.go Show resolved Hide resolved

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: enable Spock repair mode for service user creation#287

fix: enable Spock repair mode for service user creation#287
rshoemaker wants to merge 2 commits intomainfrom
fix/PLAT-478/svc_user_tx_with_repair_mode

rshoemaker commented Mar 6, 2026 •

edited

Loading

Uh oh!

coderabbitai bot commented Mar 6, 2026 •

edited

Loading

Walkthrough

Changes

Poem

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

rshoemaker commented Mar 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai bot commented Mar 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Poem

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

rshoemaker commented Mar 6, 2026 •

edited

Loading

coderabbitai bot commented Mar 6, 2026 •

edited

Loading