feat: Modular v2 enhancements — 12 gaps + modernization#83
feat: Modular v2 enhancements — 12 gaps + modernization#83
Conversation
Analyzed existing codebase against all 4 plans: - TenantGuard: ~50% exists (context, service, config, decorators) - Dynamic Reload: ~25% exists (observer, field tracking, config providers) - Aggregate Health: ~15% exists (reverseproxy health checker) - BDD/Contract Testing: ~65% exists (121 tests, contract CLI, CI) Revised checklists to only cover remaining work. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… emission Implements HealthProvider interface, HealthReport/AggregatedHealth types, provider adapters (simple, static, composite), and AggregateHealthService with fan-out evaluation, panic recovery, cache TTL, temporary error detection, and CloudEvent emission on status changes. 17 tests including concurrency and race-detector verified. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Introduces TenantGuard interface and StandardTenantGuard implementation with strict/lenient/disabled modes, whitelist support, ring buffer violation tracking, and CloudEvents emission on violations. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…and rollback Add Reloadable interface to module.go for modules that support runtime config reloading. Implement ReloadOrchestrator with single-flight execution, exponential backoff circuit breaker, reverse-order rollback on partial failure, and CloudEvents emission for reload lifecycle. New files: - reload.go: ChangeType, ConfigChange, FieldChange, ConfigDiff, ReloadTrigger types - reload_orchestrator.go: ReloadOrchestrator with queue, circuit breaker, rollback - reload_test.go: comprehensive tests (ConfigDiff, orchestrator, rollback, circuit breaker, concurrency) Also fix pre-existing WithLogger name collision in health_service.go -> WithHealthLogger. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace dynamic errors.New() with sentinel errors (err113) - Add StatusUnknown case to exhaustive switch (exhaustive) - Derive request context from parent context (contextcheck) - Wrap interface method error return (wrapcheck) - Fix gofmt alignment in builder.go, contract_verifier.go, observer.go - Update letsencrypt go.sum for httpserver@v1.12.0 checksum Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Reload orchestrator: - Sort targets by module name for deterministic reload/rollback order - Make Stop() idempotent with sync.Once, reject requests after stop - Move noop check before started event to avoid misleading event stream - Apply default 30s timeout when ReloadTimeout() returns <= 0 - Add doc comment noting application integration is a follow-up Health service: - Switch from *log.Logger to modular.Logger for consistent structured logging - Return deep copies from Check() to prevent cache mutation by callers - Use NewCloudEvent helper for proper event ID/specversion - Prefer StatusUnhealthy over StatusUnknown on tie-breaks Tenant guard: - Switch from *log.Logger to modular.Logger for consistent structured logging - Deep-copy whitelist in constructor, convert to set for O(1) lookups - Use NewCloudEvent helper for proper event ID/specversion - Use structured logging (Warn with key-value pairs) instead of Printf - Wire guard into application service registry via builder Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add ErrReloadStopped for stopped orchestrator (distinct from channel full) - Guard against nil logger with nopLogger fallback in NewReloadOrchestrator - Fix BenchmarkReload to call processReload directly instead of queue+drain - Update rollback test to assert deterministic sorted order Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- reload_orchestrator: add recover guard in RequestReload to prevent panic on send-to-closed-channel race with Stop() - builder: use app.RegisterService() instead of direct SvcRegistry map mutation for tenant guard registration - reload_contract_bdd_test: make rollback assertion deterministic now that targets are sorted by name Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- reload_orchestrator: extract handleReload() to properly scope context lifetime (no defer cancel leak in loop), propagate req.Ctx values and cancellation to module Reload calls instead of only copying deadline - health_service: worstStatus now maps StatusUnknown → StatusUnhealthy in output, consistent with documented aggregation behavior - reload_test: replace all time.Sleep waits with polling waitFor helper for deterministic CI-safe synchronization - reload_contract_bdd_test: replace time.Sleep waits with event/call polling helpers (bddWaitForEvent, bddWaitForCalls) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Keep context chain rooted in parentCtx and apply request deadline via context.WithDeadline instead of swapping the base context. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- contract_verifier: guard checkReloadIdempotent with goroutine+select so a misbehaving module cannot block the verifier indefinitely - health_service: Check now selects on ctx.Done() when collecting provider results, returning ctx.Err() on cancellation/timeout - tenant_guard: log NotifyObservers errors instead of silently dropping; update TenantGuardLenient doc to clarify logging is best-effort - reload_contract_bdd_test: simulate elapsed backoff by backdating lastFailure instead of manually clearing circuit breaker state - reload_orchestrator: propagate req.Ctx cancellation via background goroutine watching req.Ctx.Done(), not just deadline Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add ErrReloadTimeout sentinel error, use in contract verifier - Wrap ctx.Err() in health_service.Check for wrapcheck compliance Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Covers 12 gaps identified in framework audit: config-driven deps, drainable shutdown, phase tracking, parallel init, type-safe services, service readiness events, plugin interface, reload integration, config file watcher, secret resolution, slog adapter, metrics hooks. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
12 tasks covering all audit gaps: config-driven deps, drainable shutdown, phase tracking, parallel init, type-safe services, service readiness, plugin interface, reload integration, secret resolver, config watcher, slog adapter, metrics hooks. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…hints Allow injecting dependency edges into the module dependency graph from the builder/config level, without requiring modules to implement the DependencyAware interface. Hints are merged into the graph before topological sort, enabling correct init ordering and cycle detection. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ycle Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add sync.RWMutex to EnhancedServiceRegistry for thread-safe concurrent access - Protect all registry methods (RegisterService, GetService, OnServiceReady, etc.) - Fire readiness callbacks outside the lock to prevent deadlocks - Add Init(modular.Application) to ConfigWatcher to satisfy Module interface - Add ErrDynamicReloadNotEnabled sentinel error - Add []any slice handling to ExpandSecrets for YAML array configs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Applied gopls modernization across 63 files:
- Replace interface{} with any (Go 1.18+)
- Replace reflect.TypeOf((*T)(nil)).Elem() with reflect.TypeFor[T]() (Go 1.22+)
- Replace reflect.Ptr with reflect.Pointer
- Replace manual loops with slices.Contains and maps.Copy
- Replace C-style for loops with range N syntax (Go 1.22+)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR addresses 12 identified gaps in the Modular Go framework, adding new features across core lifecycle, services/plugins, configuration/reload, and observability. It also modernizes the codebase using gopls quickfixes (interface{} → any, reflect.TypeOf → reflect.TypeFor[T], reflect.Ptr → reflect.Pointer, loop modernization).
Changes:
- Adds 12 new framework capabilities: config-driven dependency hints, Drainable interface, phase tracking, parallel init, type-safe service helpers, service readiness events, Plugin interface, ReloadOrchestrator integration, config file watcher, secret resolution, slog adapter, and module metrics hooks
- Introduces thread-safe
EnhancedServiceRegistrywithsync.RWMutex, health service with caching/aggregation, tenant guard with ring buffer, and contract verifier - Modernizes 63+ files with Go idiom updates and adds comprehensive test coverage for all new features
Reviewed changes
Copilot reviewed 99 out of 101 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| go.mod, go.sum | Bumps Go version (to invalid 1.26), adds fsnotify dependency |
| phase.go, drainable.go | New AppPhase enum and Drainable interface |
| plugin.go | New Plugin/PluginWithHooks/PluginWithServices interfaces |
| metrics.go | New MetricsProvider interface and ModuleMetrics struct |
| reload.go | New ConfigChange, ConfigDiff, ReloadTrigger types |
| reload_orchestrator.go | ReloadOrchestrator with circuit breaker and rollback |
| health.go, health_service.go | Health system with providers, aggregation, caching |
| secret_resolver.go | SecretResolver interface and ExpandSecrets utility |
| slog_adapter.go | SlogAdapter wrapping *slog.Logger for Logger interface |
| service_typed.go | Generic RegisterTypedService/GetTypedService helpers |
| service.go | Thread-safe EnhancedServiceRegistry with OnServiceReady |
| tenant_guard.go | TenantGuard with ring buffer, modes, whitelist |
| contract_verifier.go | ContractVerifier for reload/health behavioral contracts |
| builder.go | New builder options (WithModuleDependency, WithDrainTimeout, etc.) |
| module.go | New Reloadable interface |
| errors.go | New sentinel errors for tenant guard and reload |
| observer.go | New event type constants |
| tenant_service.go | slices.Contains refactoring (has syntax issue) |
| observer_test.go | slices.Contains refactoring (has syntax issue) |
| config_validation.go | Modernization + accidentally removed constants |
| logger_decorator_base_bdd_test.go | Replaced errors with blank identifiers |
| config_field_tracking_test.go | switch/case indentation error |
| modules/configwatcher/* | New config file watcher module |
| Various *_test.go files | Tests for all new features + modernization |
| docs/plans/*.md | Updated design/implementation plans |
| features/*.feature | New Gherkin feature files for reload/health contracts |
| Remaining files | interface{} → any, reflect.TypeOf → reflect.TypeFor, loop modernization |
logger_decorator_base_bdd_test.go
Outdated
| _ = errors.New("unexpected message count") | ||
| _ = errors.New("message not found") | ||
| _ = errors.New("argument not found") | ||
| _ = errors.New("unexpected log level") |
There was a problem hiding this comment.
The previously named error variables (errUnexpectedMessageCount, errMessageNotFound, errArgNotFound, errUnexpectedLogLevel) have been replaced with blank identifiers (_). This means 4 errors.New(...) calls are executed at init time and immediately discarded. If these errors are genuinely unused, the entire declarations should be removed rather than assigned to _. If they are used elsewhere, they should retain their names. Additionally, the alignment/spacing of the blank identifier assignments is inconsistent (extra spaces on lines 18 and 19).
There was a problem hiding this comment.
Fixed: removed the 4 discarded errors.New calls assigned to _ and cleaned up alignment.
config_field_tracking_test.go
Outdated
| for _, fp := range tracker.FieldPopulations { | ||
| if fp.InstanceKey == "primary" { | ||
| switch fp.InstanceKey { | ||
| case "primary": |
There was a problem hiding this comment.
The case "primary": on line 400 has incorrect indentation — it's at the leftmost column instead of being aligned with the surrounding switch/case structure. This appears to be a formatting issue introduced during the if/else if to switch refactoring. The code should be properly indented to match the surrounding block.
| case "primary": | |
| case "primary": |
There was a problem hiding this comment.
Fixed: corrected the indentation of case "primary": to align with the surrounding switch structure.
| type nopLogger struct{} | ||
|
|
||
| func (nopLogger) Info(_ string, _ ...any) {} | ||
| func (nopLogger) Error(_ string, _ ...any) {} | ||
| func (nopLogger) Warn(_ string, _ ...any) {} | ||
| func (nopLogger) Debug(_ string, _ ...any) {} |
There was a problem hiding this comment.
There is a nopLogger type defined here, but there's likely already a nopLogger defined elsewhere in the package (it's used in test files like service_typed_test.go line 8 with nopLogger{}). Having duplicate type definitions in the same package will cause a compilation error. One of them should be removed or they should be consolidated into a single definition.
There was a problem hiding this comment.
Verified: nopLogger is only defined once in the package (in reload_orchestrator.go). No duplicate exists.
| if slices.Contains(registration.eventTypes, event.Type()) { | ||
| _ = registration.observer.OnEvent(ctx, event) | ||
| break | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
The observer_test.go refactoring of slices.Contains has left orphaned indentation from the original for/if block. The lines following the if slices.Contains(...) call (lines 200-201) still have the extra indentation from the removed inner block, and there's a dangling } that was part of the removed for loop. This will cause compilation errors.
There was a problem hiding this comment.
Fixed: corrected the orphaned indentation and braces left from the slices.Contains refactoring.
| func (w *ConfigWatcher) stopWatching() { | ||
| w.stopOnce.Do(func() { | ||
| close(w.stopCh) | ||
| if w.watcher != nil { | ||
| w.watcher.Close() | ||
| } | ||
| }) |
There was a problem hiding this comment.
The configwatcher module's Stop method takes a context.Context parameter (Stop(_ context.Context) error), but the Stoppable interface in the core framework defines Stop with a context.Context parameter. Meanwhile, the Start method also takes context.Context. However, looking at the core Stoppable interface, it should be Stop(ctx context.Context) error. The watcher silently ignores errors from fsnotify.Watcher.Close() in stopWatching(). If Close() returns an error, it should be propagated via the Stop method's return value.
There was a problem hiding this comment.
Fixed: stopWatching() now returns the error from watcher.Close(), and Stop() propagates it to the caller.
config_validation.go
Outdated
| tagRequired = "required" | ||
| tagValidate = "validate" | ||
| tagDesc = "desc" // Used for generating sample config and documentation | ||
|
|
There was a problem hiding this comment.
The removed constants tagValidate and tagDesc appear to have been accidentally deleted. The tagDesc constant is documented as "Used for generating sample config and documentation" in the removed comment. If these constants are still referenced elsewhere in the codebase (e.g., in GenerateSampleConfig or validation logic), this will cause compilation errors. Additionally, a blank line was left at line 20 where the constants were, which looks like an incomplete edit.
| tagValidate = "validate" | |
| tagDesc = "desc" |
There was a problem hiding this comment.
Fixed: restored tagValidate and tagDesc constants. They are reserved for validation and sample config generation.
| svcRegistry := app.SvcRegistry() | ||
| raw, exists := svcRegistry[name] | ||
| if !exists { | ||
| return zero, fmt.Errorf("%w: %s", ErrServiceNotFound, name) | ||
| } |
There was a problem hiding this comment.
The GetTypedService function accesses app.SvcRegistry() which returns a ServiceRegistry (a map[string]any). This is a snapshot/copy from the EnhancedServiceRegistry.AsServiceRegistry() method, which now acquires a read lock. However, calling SvcRegistry() for a simple service lookup is inefficient because AsServiceRegistry() creates a new map and copies all entries on every call. Consider using app.GetService() internally instead, or adding a method to the Application interface that provides direct typed access without the map copy overhead.
There was a problem hiding this comment.
Added a doc comment noting the performance trade-off. The Application interface only exposes GetService(name, target) which uses reflection-based assignment, so the generic GetTypedService approach with SvcRegistry() is the most practical for compile-time type safety. An interface change would be needed to avoid the map copy.
benchmark_test.go
Outdated
| svcReg := registry.AsServiceRegistry() | ||
|
|
||
| b.ResetTimer() | ||
| for b.Loop() { | ||
| _ = svcReg["bench-service"] |
There was a problem hiding this comment.
The BenchmarkServiceLookup benchmark calls registry.AsServiceRegistry() once and then benchmarks a map index operation. This doesn't measure service lookup performance through the actual GetService method with its RWMutex locking. Since the registry now has locking, a direct map index on a pre-computed snapshot doesn't reflect real-world lookup cost. Consider benchmarking registry.GetService("bench-service") instead.
| svcReg := registry.AsServiceRegistry() | |
| b.ResetTimer() | |
| for b.Loop() { | |
| _ = svcReg["bench-service"] | |
| b.ResetTimer() | |
| for b.Loop() { | |
| _, _ = registry.GetService("bench-service") |
There was a problem hiding this comment.
Fixed: benchmark now calls registry.GetService("bench-service") to measure the actual lookup path including RWMutex locking, rather than a pre-computed map snapshot.
tenant_service.go
Outdated
| if slices.Contains(ts.tenantAwareModules, module) { | ||
| ts.logger.Debug("Module already registered as tenant-aware", | ||
| "module", fmt.Sprintf("%T", module), "name", module.Name()) | ||
| return nil | ||
| } |
There was a problem hiding this comment.
The slices.Contains refactoring has left behind orphaned closing braces and indentation from the original for/if block. The code inside the if slices.Contains(...) block still has the old indentation level from the removed inner if block, and there's a dangling } on line 173 that was part of the removed for loop. This will likely cause a compilation error due to mismatched braces.
There was a problem hiding this comment.
Fixed: corrected the orphaned indentation and braces from the slices.Contains refactoring in RegisterTenantAwareModule.
- Remove discarded error vars assigned to _ in logger_decorator_base_bdd_test.go - Fix indentation in config_field_tracking_test.go switch case - Fix orphaned indentation from slices.Contains refactoring in observer_test.go and tenant_service.go - Restore tagValidate and tagDesc constants in config_validation.go - Propagate fsnotify.Watcher.Close() error in configwatcher Stop method - Benchmark GetService through registry lock instead of pre-copied map - Add performance note to GetTypedService doc comment Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Keep HEAD (ours) for all modernization changes (range N, wg.Go, slices.Contains, maps.Copy, any, t.Context, TypeFor) and v2 enhancements (builder fields, plugin support, drain timeout, parallel init, dynamic reload, phase events). Incorporate new helper methods from main (reset(), getLastChanges(), getEvents()). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
📋 API Contract Changes SummaryChanged Components:Core FrameworkContract diff saved to artifacts/diffs/core.json Module: authContract diff saved to artifacts/diffs/auth.json Module: cacheContract diff saved to artifacts/diffs/cache.json Module: chimuxContract diff saved to artifacts/diffs/chimux.json Module: databaseContract diff saved to artifacts/diffs/database.json Module: eventbusContract diff saved to artifacts/diffs/eventbus.json Module: eventloggerContract diff saved to artifacts/diffs/eventlogger.json Module: httpclientContract diff saved to artifacts/diffs/httpclient.json Module: httpserverContract diff saved to artifacts/diffs/httpserver.json Module: jsonschemaContract diff saved to artifacts/diffs/jsonschema.json Module: letsencryptContract diff saved to artifacts/diffs/letsencrypt.json Module: logmaskerContract diff saved to artifacts/diffs/logmasker.json Module: reverseproxyContract diff saved to artifacts/diffs/reverseproxy.json Module: schedulerContract diff saved to artifacts/diffs/scheduler.json Artifacts📁 Full contract diffs and JSON artifacts are available in the workflow artifacts. |
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
application.go
Outdated
| if app.enhancedSvcRegistry != nil { | ||
| app.enhancedSvcRegistry.SetCurrentModule(module) | ||
| } | ||
|
|
||
| if err := module.Init(appToPass); err != nil { | ||
| if app.enhancedSvcRegistry != nil { | ||
| app.enhancedSvcRegistry.ClearCurrentModule() | ||
| } | ||
| return fmt.Errorf("module '%s' failed to initialize: %w", moduleName, err) | ||
| } | ||
|
|
||
| if _, ok := module.(ServiceAware); ok { | ||
| for _, svc := range module.(ServiceAware).ProvidesServices() { | ||
| if err := app.RegisterService(svc.Name, svc.Instance); err != nil { | ||
| if app.enhancedSvcRegistry != nil { | ||
| app.enhancedSvcRegistry.ClearCurrentModule() | ||
| } | ||
| return fmt.Errorf("module '%s' failed to register service '%s': %w", moduleName, svc.Name, err) | ||
| } | ||
| } | ||
| } | ||
|
|
||
| if app.enhancedSvcRegistry != nil { | ||
| app.enhancedSvcRegistry.ClearCurrentModule() |
There was a problem hiding this comment.
Race condition: In parallel init, SetCurrentModule/ClearCurrentModule calls from concurrent goroutines will overwrite each other. The EnhancedServiceRegistry.currentModule is a single shared field, so when modules A and B init concurrently: goroutine A calls SetCurrentModule(modA), then goroutine B calls SetCurrentModule(modB) — now when modA calls RegisterService, it will be associated with modB.
The initMu only guards injectServices/moduleRegistry writes, not the SetCurrentModule → Init → RegisterService → ClearCurrentModule sequence. The entire sequence from SetCurrentModule through ClearCurrentModule needs to be atomic per module, or currentModule tracking needs to be made goroutine-local (e.g., using a goroutine-local context or passing the module directly to RegisterService).
There was a problem hiding this comment.
Fixed: added RegisterServiceForModule() to EnhancedServiceRegistry which takes the module directly, eliminating the shared currentModule race. initModule() now uses this instead of the SetCurrentModule/ClearCurrentModule pattern, so concurrent goroutines during parallel init can never associate services with the wrong module. Commit 357f242.
Add RegisterServiceForModule() to EnhancedServiceRegistry that takes the module directly instead of relying on the shared currentModule field. initModule() now uses this method, making parallel init safe from the SetCurrentModule/ClearCurrentModule race where concurrent goroutines could associate services with the wrong module. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
All CI workflows now use Go ^1.26 to match go.mod requirements. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…iles Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
| @@ -94,16 +130,34 @@ func (r *EnhancedServiceRegistry) RegisterService(name string, service any) (str | |||
| // Register the service | |||
| r.services[actualName] = entry | |||
|
|
|||
| // Collect callbacks to fire outside the lock | |||
| var callbacksToFire []func(any) | |||
| for _, cbName := range []string{name, actualName} { | |||
| if callbacks, ok := r.readyCallbacks[cbName]; ok { | |||
| callbacksToFire = append(callbacksToFire, callbacks...) | |||
| delete(r.readyCallbacks, cbName) | |||
| } | |||
| } | |||
|
|
|||
| // Track module associations | |||
| if moduleName != "" { | |||
| r.moduleServices[moduleName] = append(r.moduleServices[moduleName], actualName) | |||
| } | |||
|
|
|||
| r.mu.Unlock() | |||
|
|
|||
| // Fire callbacks outside the lock to avoid deadlocks | |||
| for _, cb := range callbacksToFire { | |||
| cb(service) | |||
| } | |||
|
|
|||
| return actualName, nil | |||
| } | |||
There was a problem hiding this comment.
The registerServiceLocked function manually unlocks r.mu at line 147 before firing callbacks, but if a panic occurs between acquiring the lock (at the caller) and reaching line 147, the mutex will never be unlocked. All other methods in this struct use defer r.mu.Unlock() or defer r.mu.RUnlock(). Consider restructuring to use defer for the critical section (e.g., extract the locked portion into a helper that returns the callbacks, then fire them after the helper returns).
| for len(placed) < len(order) { | ||
| var level []string | ||
| for _, name := range order { | ||
| if placed[name] { | ||
| continue | ||
| } | ||
| // Check if all deps are placed | ||
| ready := true | ||
| for dep := range deps[name] { | ||
| if !placed[dep] { | ||
| ready = false | ||
| break | ||
| } | ||
| } | ||
| if ready { | ||
| level = append(level, name) | ||
| } | ||
| } | ||
| for _, name := range level { | ||
| placed[name] = true | ||
| } | ||
| levels = append(levels, level) | ||
| } | ||
| return levels |
There was a problem hiding this comment.
computeDepthLevels can enter an infinite loop if any module in order has a dependency that is not itself in order. In that case, the module can never become "ready" (its dep is never placed), level will be empty, and the for len(placed) < len(order) loop will spin forever. Add a guard to break out with an error if no progress is made in an iteration (i.e., len(level) == 0).
| func (w *ConfigWatcher) eventLoop() { | ||
| var timer *time.Timer | ||
| changedPaths := make(map[string]struct{}) | ||
| var mu sync.Mutex | ||
|
|
||
| for { | ||
| select { | ||
| case event, ok := <-w.watcher.Events: | ||
| if !ok { | ||
| return | ||
| } | ||
| if event.Has(fsnotify.Write) || event.Has(fsnotify.Create) { | ||
| mu.Lock() | ||
| changedPaths[event.Name] = struct{}{} | ||
| if timer != nil { | ||
| timer.Stop() | ||
| } | ||
| timer = time.AfterFunc(w.debounce, func() { | ||
| if w.onChange != nil { | ||
| mu.Lock() | ||
| paths := make([]string, 0, len(changedPaths)) | ||
| for p := range changedPaths { | ||
| paths = append(paths, p) | ||
| } | ||
| changedPaths = make(map[string]struct{}) | ||
| mu.Unlock() | ||
| w.onChange(paths) | ||
| } | ||
| }) | ||
| mu.Unlock() | ||
| } |
There was a problem hiding this comment.
The configwatcher module's eventLoop accesses changedPaths from both the main select loop goroutine and the time.AfterFunc callback goroutine. While a mutex protects the access sites, the changedPaths variable is reassigned in the AfterFunc callback (changedPaths = make(map[string]struct{})) without the outer goroutine's select loop being aware. After the AfterFunc fires and resets changedPaths, the next event case in the select loop will add to the newly created map while the timer callback may still be running. The mu mutex mitigates some of this, but the changedPaths variable itself is captured by closure and reassigned, which is fragile. Consider using a single owner pattern or clearing the map (clear(changedPaths)) instead of reassigning the variable.
| func (r *EnhancedServiceRegistry) GetServicesByModule(moduleName string) []string { | ||
| r.mu.RLock() | ||
| defer r.mu.RUnlock() | ||
| return r.moduleServices[moduleName] | ||
| } |
There was a problem hiding this comment.
The GetServicesByModule method returns the internal slice directly without copying. With the new sync.RWMutex protecting the registry, callers could receive a reference to the internal slice and read/modify it concurrently with other operations that append to it (e.g., RegisterService). All other read methods either return scalar values or create copies. This should return a copy of the slice for consistency and thread safety.
Summary
Addresses all 12 gaps identified in the Modular framework audit, making it a more complete foundation for the Workflow engine and other consumers. Also modernizes the entire codebase using gopls quickfixes.
Core Lifecycle (Section 1)
WithModuleDependency(from, to)builder option injects edges into the dependency graph before resolutionPreStop(ctx)called beforeStop()for graceful drain of in-flight work, with configurableWithDrainTimeout(d)Phase()method returningAppPhaseenum (Created → Initializing → Initialized → Starting → Running → Draining → Stopping → Stopped), emitsEventTypeAppPhaseChangedCloudEventsWithParallelInit()initializes modules at the same topological depth concurrently via errgroupServices & Plugins (Section 2)
RegisterTypedService[T]/GetTypedService[T]generic functions with compile-time type safetyOnServiceReady(name, callback)fires immediately if registered or defers until registrationPlugin/PluginWithHooks/PluginWithServicesinterfaces withWithPlugins(...)builder optionConfiguration & Reload (Section 3)
WithDynamicReload()wires the orchestrator into the app lifecyclemodules/configwatcherpackage using fsnotify with debounced change detectionSecretResolverinterface +ExpandSecrets()for${prefix:path}patterns (handles maps, nested maps, and slices)Observability (Section 4)
SlogAdapterwrapping*slog.LoggerwithWith()/WithGroup()chainingMetricsProviderinterface +CollectAllMetrics(ctx)for vendor-neutral metric collectionCode Quality
EnhancedServiceRegistrywithsync.RWMutex(safe for parallel init)ErrDynamicReloadNotEnabledsentinel errorModuleinterfaceinterface{}→any,reflect.TypeOf→reflect.TypeFor[T],reflect.Ptr→reflect.Pointer, manual loops →slices.Contains/maps.Copy, C-style for →range NDesign
See:
docs/plans/2026-03-09-modular-v2-enhancements-design.mdImplementation Plan
See:
docs/plans/2026-03-09-modular-v2-enhancements.mdTest plan
go test -count=1 -race ./...passes (root + configwatcher packages)go build ./...compiles cleanfeeders/yaml.govet errors are unrelated to this PR🤖 Generated with Claude Code