Add tenant migration script across organizations by nevil-mathew · Pull Request #881 · ELEVATE-Project/user

nevil-mathew · 2026-03-10T14:25:42Z

Summary by CodeRabbit

Chores
- Added a tenant data migration capability to support organizational data transfers, including validation, external data remapping, role management, and post-migration verification with dry-run support.

nevil-mathew · 2026-03-10T14:28:00Z

@coderabbitai full review

coderabbitai · 2026-03-10T14:28:53Z

✅ Actions performed

Full review triggered.

coderabbitai · 2026-03-10T14:29:07Z

Walkthrough

A new migration script is introduced that orchestrates moving tenant data across organizations. The script handles configuration validation, database operations, external entity remapping, role management, transactional data copying/moving, post-migration verification, and supports dry-run mode with extensive logging throughout the process.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Tenant Data Migration Script: `src/scripts/migrateTenantOrgData.js` | Introduces a comprehensive migration workflow (2433 lines) with config/validation, database utilities, external data fetching and remapping, context preparation, role mapping strategies (strict-id and map-by-title), copy/move operations for forms/entities/templates/users, cleanup with soft/hard/none delete modes, post-migration validation, dry-run support, serializable transactions with locking, and robust error handling with rollback on failure. |
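
To make the two role mapping strategies named in the summary more concrete, here is a minimal sketch of how source roles might be resolved against target roles. It is not taken from the script: the function name `resolveRoleMappings`, the row shape `{ id, title, deleted_at }`, the reading of strict-id as "same id must exist in the target", and the case-insensitive title comparison are all assumptions made for illustration.

```javascript
// Hypothetical sketch: names, row shapes, and matching rules are assumptions,
// not the actual logic in migrateTenantOrgData.js.
function resolveRoleMappings(sourceRoles, targetRoles, strategy) {
  let keyOf;
  if (strategy === 'strict-id') {
    keyOf = (role) => String(role.id);
  } else if (strategy === 'map-by-title') {
    // Assumed normalization: trim and lowercase the title before comparing.
    keyOf = (role) => String(role.title).trim().toLowerCase();
  } else {
    throw new Error(`Unknown role mapping strategy: ${strategy}`);
  }

  // Only active target roles are candidates; soft-deleted rows are skipped.
  const targetByKey = new Map();
  for (const role of targetRoles) {
    if (role.deleted_at == null) targetByKey.set(keyOf(role), role);
  }

  const mapping = new Map(); // source role id -> target role id
  const unmapped = []; // source role ids with no active counterpart
  for (const role of sourceRoles) {
    const match = targetByKey.get(keyOf(role));
    if (match) mapping.set(role.id, match.id);
    else unmapped.push(role.id);
  }
  return { mapping, unmapped };
}
```

A caller could treat a non-empty `unmapped` list as a precheck failure, so the migration stops before any rows are written.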

Sequence Diagram

sequenceDiagram
    actor User
    participant Script as Migration Script
    participant DB as Database
    participant ExtSvc as External Services
    participant TxnMgr as Transaction Manager

    User->>Script: Execute migration (source, target tenant, options)
    Script->>Script: Validate config & arguments
    activate Script
    
    Script->>TxnMgr: Start SERIALIZABLE transaction
    activate TxnMgr
    
    Script->>DB: Apply locks on source/target tables
    activate DB
    DB-->>Script: Locks acquired
    deactivate DB
    
    Script->>DB: Query source tenant context (users, roles, orgs)
    DB-->>Script: Source context data
    
    Script->>DB: Query target tenant context
    DB-->>Script: Target context data
    
    Script->>ExtSvc: Fetch external entity mappings (parallel batches)
    activate ExtSvc
    ExtSvc-->>Script: External ID mappings
    deactivate ExtSvc
    
    Script->>Script: Remap external metadata for users
    Script->>Script: Resolve role mappings (strategy-dependent)
    
    alt Dry-Run Mode
        Script->>Script: Log planned copy/move/delete operations
        Script-->>User: Report dry-run results
    else Execute Mode
        Script->>DB: Copy forms, entity types, entities to target
        DB-->>Script: Copy complete
        
        Script->>DB: Move users, user_organizations, user_roles to target
        DB-->>Script: Move complete
        
        Script->>DB: Apply role mappings to moved users
        DB-->>Script: Mappings applied
        
        Script->>DB: Delete/soft-delete source rows (per cleanup mode)
        DB-->>Script: Cleanup complete
        
        Script->>DB: Verify migrated data counts & role consistency
        DB-->>Script: Validation results
        
        Script->>Script: Validate post-migration state
    end
    
    alt Validation Successful
        Script->>TxnMgr: Commit transaction
        TxnMgr->>DB: Commit all changes
        TxnMgr-->>Script: Transaction committed
        Script-->>User: Migration succeeded
    else Validation Failed
        Script->>TxnMgr: Rollback transaction
        TxnMgr->>DB: Rollback all changes
        TxnMgr-->>Script: Transaction rolled back
        Script-->>User: Migration failed (error details)
    end
    
    deactivate TxnMgr
    deactivate Script
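
The control flow in the diagram can be sketched roughly as follows. This is an illustration under stated assumptions, not the script's code: `beginTransaction`, `steps`, `validate`, and `log` are hypothetical injected dependencies, and the choice to roll back at the end of a dry run is a guess about how the script releases its transaction.

```javascript
// Hypothetical sketch: all dependencies are injected stand-ins, not functions
// from the actual script.
async function runMigration({ beginTransaction, steps, validate, dryRun, log }) {
  const tx = await beginTransaction(); // assumed to open a SERIALIZABLE transaction
  try {
    if (dryRun) {
      for (const step of steps) log(`[dry-run] would run: ${step.name}`);
      await tx.rollback(); // nothing was written, so just release the transaction
      return { status: 'dry-run', planned: steps.map((step) => step.name) };
    }

    // Execute mode: copy, move, role mapping, and cleanup run in order.
    for (const step of steps) {
      await step.run(tx);
    }

    // Post-migration validation decides between commit and rollback.
    const problems = await validate(tx);
    if (problems.length > 0) {
      throw new Error(`Post-migration validation failed: ${problems.join('; ')}`);
    }
    await tx.commit();
    return { status: 'committed' };
  } catch (err) {
    await tx.rollback();
    return { status: 'rolled-back', error: err.message };
  }
}
```

Keeping every write and the validation inside one `try` block means a failed step and a failed validation take the same rollback path.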

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~65 minutes

Possibly related PRs

Partial unique index on user_roles; composite lookups, new validators #834: Modifies external-entity mapping logic and composite-key normalization that the migration script depends on for correct external ID lookups during metadata remapping.

Poem

🐰 Hopping through tenants, organizing with care,
Migrations now flow with precision and flair,
External IDs mapped, roles aligned just right,
Transactions ensure no data takes flight—
A rabbit's delight: data moving with grace! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The pull request title accurately summarizes the main change: introducing a tenant migration script for migrating data across organizations, which is the primary focus of the changeset. |

coderabbitai

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/scripts/migrateTenantOrgData.js`:
- Around line 2318-2368: The external I/O call remapExternalMetaForUsers is
executed while a SERIALIZABLE transaction and locks are held (tx created and
lockByStrategy called), which can prolong locks; move the
remapExternalMetaForUsers call out of the write transaction by invoking it
before opening the transaction and applying locks (or immediately after
buildContextAndPrecheck but outside any tx), assign its usersForInsert back onto
context, and keep the same external_meta_remap_ready logging; update any call
sites (remapExternalMetaForUsers(context)) so it does not depend on tx or locked
resources and ensure context is passed/returned unchanged except for
usersForInsert.
- Around line 90-109: The usage text in printUsage incorrectly shows "node
scripts/migrateTenantOrgData.js"; update the first usage example string to the
correct entrypoint "node src/scripts/migrateTenantOrgData.js" (i.e., change the
path in the template literal inside the printUsage function) so the help output
points to the actual file location.
- Around line 1220-1254: The target-role SELECT used for strict-id (the
querySelect that populates targetRolesByIdRows) currently includes soft-deleted
rows; modify that SQL to exclude deleted roles (e.g., add a WHERE clause
condition like ur.deleted_at IS NULL or equivalent) so the Map created as
targetRoleById only contains active roles, and then apply the same change to
the analogous query that builds targetRolesByIdRows in the later strict-id
path (around lines 1448-1459) to ensure strict-id never matches soft-deleted
roles.
- Around line 1079-1115: The query that builds sourceRolesFromUsers currently
un-nests users.roles without ensuring those role IDs actually belong to the
source org(s), so stale cross-org role IDs can be pulled in; update the SQL used
for sourceRolesFromUsers (the querySelect that assigns sourceRolesFromUsers) to
only select role IDs that exist in user_roles for the source tenant and allowed
organization scope (e.g., unnest roles in a lateral subquery or CTE first, since
PostgreSQL rejects set-returning functions in JOIN conditions, then JOIN
user_roles ur on the unnested role ID and join organizations o to ensure
ur.organization_id / o.code matches the source/default org), so
requiredRoleIds / requiredRoleIdArray only contains role IDs that truly belong
to the source org before fetching sourceRoleRows and building sourceRoleById.
- Around line 1395-1446: After rebasing user_roles.id via the loops that use
queryRaw (see nextFreeRoleId, tempSeed, targetToSourceIdMap, tx), run a final
queryRaw inside the same transaction to reset the table's backing sequence to at
least the current max(id) for that tenant; call
setval(pg_get_serial_sequence('user_roles','id'), coalesce((SELECT MAX(id) FROM
user_roles WHERE tenant_code = $targetTenant AND deleted_at IS NULL), 1), true)
(or equivalent) passing targetTenant and tx so subsequent inserts won't reuse
updated ids.
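
As an illustration of the first finding above, the reordering could look roughly like the sketch below. The names `buildContextAndPrecheck`, `remapExternalMetaForUsers`, and `lockByStrategy` come from the review text, but their signatures and return shapes here are assumptions, and `beginTransaction`, `writeAll`, and `log` are hypothetical stand-ins for whatever the script actually uses.

```javascript
// Hypothetical wiring: every dependency is injected so the ordering is visible.
// Signatures are assumed, not copied from migrateTenantOrgData.js.
async function migrateWithRemapOutsideTx(deps) {
  const {
    buildContextAndPrecheck,
    remapExternalMetaForUsers,
    beginTransaction,
    lockByStrategy,
    writeAll,
    log,
  } = deps;

  const context = await buildContextAndPrecheck();

  // External I/O runs here, before any transaction or lock exists.
  const { usersForInsert } = await remapExternalMetaForUsers(context);
  context.usersForInsert = usersForInsert;
  log('external_meta_remap_ready', { users: usersForInsert.length });

  // Only now open the write transaction and take locks.
  const tx = await beginTransaction();
  try {
    await lockByStrategy(tx, context);
    await writeAll(tx, context);
    await tx.commit();
  } catch (err) {
    await tx.rollback();
    throw err;
  }
  return context;
}
```

One trade-off of this ordering is that the context is read before the locks are taken, so rows could change in between. A re-check inside the transaction may be worth considering if that matters for the migration.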

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8af91057-976c-4e6e-bdcc-afd837d57ec1

📥 Commits

Reviewing files that changed from the base of the PR and between 4991c25 and 60be852.

⛔ Files ignored due to path filters (2)

AGENTS.md is excluded by !**/*.md
src/scripts/readme.md is excluded by !**/*.md

📒 Files selected for processing (1)

src/scripts/migrateTenantOrgData.js

coderabbitai · 2026-03-10T14:41:59Z