fix: concurrency/correctness fixes in refresh pipeline + auto-refresh improvements#95
Merged
devnullvoid merged 7 commits intodevnullvoid:developfrom Mar 8, 2026
Merged
Conversation
Merge from original
Feature/clustergroup
When no default_profile is configured and no --profile flag or PVETUI_PROFILE env var is provided, the bootstrap now prompts the user interactively instead of hard-coding a fallback to the literal name 'default' (which would error if no such profile existed). - Remove the implicit 'default' name fallback in ResolveProfile; empty string is returned to signal that interactive selection is needed. - Add promptProfileSelection: numbered terminal list (groups first, then individual profiles) with re-prompt on bad input and clean exit on EOF/Ctrl+C. Single profile with no groups is auto-selected silently. - Pre-check in bootstrap: if default_profile resolves to a name that no longer exists (stale after rename/delete), warn and fall back to the interactive chooser instead of hard-erroring. - setDefaultProfile now toggles: pressing s on the already-default entry clears default_profile to '' so the chooser runs on next startup. - Connection Profiles dialog re-opens after s so the updated star indicator is immediately visible. - Help bar updated: 's:toggle default'.
profiles: add interactive startup chooser and toggle-off default
Replace atomic.Bool refresh guard with generation token (atomic.Uint64) to prevent stale callbacks from clearing another refresh's guard. Add sync.RWMutex (connMu) protecting connection-state fields (client, groupManager, isGroupMode, isClusterMode) that were read by background goroutines without synchronization. Specific issues fixed: - WP1: refreshInProgress atomic.Bool replaced by refreshGen atomic.Uint64 with startRefresh/forceNewRefresh/endRefresh helpers; generation token ensures only the owning callback can release the guard. - WP2: connMu RWMutex guards all connection-state field writes; background goroutines use snapConn() to get a consistent lock-safe snapshot. deactivateGroupModes restructured to close resources outside the lock. - WP3: enrichGroupNodesParallel accumulates enriched nodes in a local slice; single swap into GlobalState inside QueueUpdateDraw eliminates the concurrent write-to-shared-slice data race. - WP4a: autoRefreshDataWithFooter captures UI selections (GetSelectedVM, GetSelectedNode, GetItemCount) on the UI goroutine before spawning the background goroutine; passed as selSnap struct to avoid tview reads off-thread. - WP4b: doFastRefresh VM enrichment callback stale guard changed from 'refreshGen != token' to 'current != 0 && current != token' so it aborts only on a newer refresh, not on normal completion; this fixes the header loading spinner never clearing and VM details not updating for the selected item after profile switch. - WP4c: Retry errors in GetFastFreshClusterStatus guest agent retry loop are now aggregated into enrichErr and surfaced to the onEnrichmentComplete callback instead of being silently dropped.
Auto-refresh called GetLightClusterStatus() / GetGroupClusterResources() which return VMs with basic metrics only. The resulting VM objects were written directly into GlobalState, silently zeroing guest agent data (IPs, network interfaces, filesystems) and config details (boot order, CPU cores, storage devices, OS type, description) on every cycle. Add preserveVMEnrichmentData() helper that builds an O(N) lookup map from existing GlobalState VMs and carries all enrichment fields forward onto freshly-fetched VMs. Applied in both single-profile and group mode paths, matching the node-detail preservation pattern already in place. Group mode auto-refresh called enrichGroupNodesParallel(), which makes individual API calls to every node (version, disks, load avg, etc.) and takes 10-20+ seconds on multi-node setups. The refresh guard was held for the full duration, so the countdown timer always saw isRefreshActive == true at expiry, reset to 10, and skipped — then on the next expiry triggered another full enrichment. Result: perpetual spinner, countdown never completed, auto-refresh effectively broken. Remove enrichGroupNodesParallel from the auto-refresh path. Auto-refresh is now lightweight in both modes: metrics update, enriched data is preserved, UI lists and details update immediately, and the guard is released synchronously inside QueueUpdateDraw. Full node enrichment continues to run on manual refresh (R key) and profile switches only.
Fix/concurrency correctness
Owner
|
Reviewed and tested locally against Local result:
Follow-up fixes applied locally:
Validation run locally on the merged
Local commits on top of
This PR should target |
Owner
|
Included commits on
This will be included in the next release. Thanks for the contribution. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Changes
Interactive startup profile chooser
internal/bootstrap/bootstrap.gopackage encapsulates startup profile resolution logicRefresh pipeline rework
atomic.Bool refreshInProgresswithatomic.Uint64 refreshGengeneration token; addedstartRefresh/forceNewRefresh/endRefreshhelperssync.RWMutex connMuprotectingclient,groupManager,isGroupMode,isClusterMode; all background goroutines usesnapConn()for race-free readsenrichGroupNodesParallelnow accumulates results in a local slice and swaps intoGlobalStatein a singleQueueUpdateDrawcallbackautoRefreshDataWithFootercaptures UI selection state (captureSelections()/selSnap) on the UI goroutine before spawning the background goroutinecurrent != 0 && current != tokenso it aborts only on a newer refresh, not on normal completionGetFastFreshClusterStatusare now aggregated intoenrichErrand passed toonEnrichmentCompleteforceNewRefresh+doFastRefreshinstead ofmanualRefreshdoManualRefreshanddoFastRefreshsplit into separate functions;doEnrichNodesandenrichGroupNodesParallelaccept a generation tokenAuto-refresh
preserveVMEnrichmentData()helper that carries guest agent data (IPs, network interfaces, filesystems) and config details (boot order, CPU cores, storage devices, OS type, description) forward fromGlobalStateonto freshly-fetched VMs on each auto-refresh cycleenrichGroupNodesParallelfrom the group mode auto-refresh path; auto-refresh is now lightweight in both modes and releases the refresh guard synchronouslytime.NewTicker; spinner goroutine listens onautoRefreshCountdownStopinstead of a separate channel