
feat: replace MCP env-var config with bind-mounted YAML config files#280

Merged
rshoemaker merged 6 commits into main from feat/PLAT-444/mcp_service_config on Mar 5, 2026

Conversation

@rshoemaker
Contributor

@rshoemaker rshoemaker commented Mar 2, 2026

Replace environment-variable-based MCP service configuration with CP-generated YAML config files (config.yaml, tokens.yaml, users.yaml) bind-mounted into the service container at /app/data/.

config.yaml is CP-owned and regenerated on every update, carrying database connection, LLM provider, tool toggles, and pool settings. tokens.yaml and users.yaml are bootstrap-once: written on initial create if init_token/init_users are provided, then left for the running MCP server to manage. This separates CP provisioning concerns from application-level auth management.

Key changes:

  • Add MCPServiceConfig typed parser with validation for all supported config keys (LLM, embedding, pool, tool toggles, bootstrap auth)
  • Add MCPConfigResource lifecycle (Create writes all three files, Update regenerates only config.yaml, Refresh verifies file exists)
  • Add token/user file generators with SHA-256 and bcrypt hashing
  • Wire MCPConfigResource into the service instance resource chain between DirResource and ServiceInstanceSpec
  • Resolve database host/port from co-located Postgres instance in plan_update for the config.yaml connection block
  • Redact sensitive config keys (API keys, init_users) from API responses
  • Align service user grants with MCP server security guide: public schema only, pg_read_all_settings, no spock access
  • Add TestUpdateMCPServiceConfig E2E test for config update path
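The token hashing mentioned in the list above could be sketched like this. SHA-256 applies to bearer tokens; the bcrypt half (used for user passwords via golang.org/x/crypto/bcrypt) is omitted to keep the sketch dependency-free:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashToken returns the hex-encoded SHA-256 digest of a bearer token, so
// tokens.yaml never stores the plaintext token. Illustrative only; the PR's
// generator code is not reproduced here.
func hashToken(token string) string {
	sum := sha256.Sum256([]byte(token))
	return hex.EncodeToString(sum[:])
}

func main() {
	fmt.Println(hashToken("init-token-example"))
}
```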

PLAT-444 PLAT-462

@coderabbitai

coderabbitai bot commented Mar 2, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

This PR implements MCP service configuration management, adding typed configuration parsing, validation, file generation, and resource lifecycle management. Changes include new MCP config data structures, YAML generators for authentication and configuration files, resource orchestration, and updated service container specs using volume mounts for configuration delivery.

Changes

Cohort / File(s) Summary
MCP Service Configuration
server/internal/database/mcp_service_config.go, server/internal/database/mcp_service_config_test.go
Introduces typed MCP service configuration structures (MCPServiceUser, MCPServiceConfig) and ParseMCPServiceConfig function with comprehensive validation for providers, API keys, embedding settings, numeric ranges, init users, and bootstrap-only field rejection on updates. Includes 840 lines of unit test coverage.
API Validation Refactoring
server/internal/api/apiv1/validate.go, server/internal/api/apiv1/validate_test.go, server/internal/api/apiv1/convert.go
Adds isUpdate parameter propagation through validation chain; delegates MCP config validation to database.ParseMCPServiceConfig; removes legacy field-level validation; treats "init_users" as sensitive config key; updates error messages to reflect new config-level validation approach.
MCP File Generation
server/internal/orchestrator/swarm/mcp_auth_files.go, server/internal/orchestrator/swarm/mcp_auth_files_test.go, server/internal/orchestrator/swarm/mcp_config.go, server/internal/orchestrator/swarm/mcp_config_test.go
Implements YAML generation for MCP token and user authentication files with SHA-256 hashing and bcrypt password encoding; introduces MCPConfigParams and GenerateMCPConfig for producing complete MCP server configuration YAML with HTTP, LLM, embedding, and database settings. Includes 1,710 lines of implementation and test coverage.
MCP Resource Lifecycle
server/internal/orchestrator/swarm/mcp_config_resource.go
Introduces MCPConfigResource with full lifecycle management (Create, Update, Delete, Refresh) and conditional file generation; integrates with existing resource framework and filesystem interfaces; manages config.yaml regeneration and conditional creation of tokens.yaml and users.yaml.
Service Container Updates
server/internal/orchestrator/swarm/service_spec.go, server/internal/orchestrator/swarm/service_instance_spec.go, server/internal/orchestrator/swarm/service_user_role.go
Replaces environment-variable configuration with volume bind mounts; adds DataPath field to service options; removes buildServiceEnvVars function; updates service user role to use LOGIN attribute with explicit database privileges and grants; adds filesystem path resolution for service data directory.
Orchestrator Integration
server/internal/orchestrator/swarm/orchestrator.go, server/internal/orchestrator/swarm/resources.go
Adds early MCP config parsing and MCPConfigResource creation in GenerateServiceInstanceResources; introduces DataDirID to ServiceInstanceSpecResource; registers MCPConfigResource in resource type system; wires data directory dependency into resource chain.
Workflow and End-to-End Tests
e2e/service_provisioning_test.go, server/internal/workflows/activities/generate_service_instance_resources.go, server/internal/workflows/plan_update.go
Adds TestUpdateMCPServiceConfig end-to-end test verifying in-place service updates; changes activity queue selection from manager to host-specific queue; fixes postgres port handling to consistently use internal port 5432.
Dependencies and Licensing
go.mod, NOTICE.txt
Marks golang.org/x/crypto as direct dependency; adds MIT license notice for github.com/goccy/go-yaml v1.18.0.
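The sensitive-key redaction noted in the API Validation Refactoring row above could look roughly like the following. Only init_users and the API keys are confirmed sensitive by this PR; the exact key set here is an assumption:

```go
package main

import "fmt"

// sensitiveConfigKeys is an assumed key list mirroring the idea of redacting
// secrets from API responses; the PR's actual set lives in convert.go.
var sensitiveConfigKeys = map[string]bool{
	"anthropic_api_key": true,
	"openai_api_key":    true,
	"embedding_api_key": true,
	"init_token":        true,
	"init_users":        true,
}

// redactConfig returns a copy of config with sensitive values masked, leaving
// the stored config untouched.
func redactConfig(config map[string]any) map[string]any {
	out := make(map[string]any, len(config))
	for k, v := range config {
		if sensitiveConfigKeys[k] {
			out[k] = "[REDACTED]"
		} else {
			out[k] = v
		}
	}
	return out
}

func main() {
	cfg := map[string]any{"llm_provider": "anthropic", "anthropic_api_key": "sk-secret"}
	fmt.Println(redactConfig(cfg)["anthropic_api_key"]) // [REDACTED]
}
```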

Poem

🐰 A config that changes without restart—how delightful!
With YAML files mounted and tokens tucked away,
MCP services dance in place while we refresh their way,
Bcrypt and SHA hashes guard the credentials tight,
From bootstrap to production, every detail done right! 🔐

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning: docstring coverage is 41.94%, below the required threshold of 80.00%. Resolution: write docstrings for the functions that are missing them.
✅ Passed checks (2 passed)
  • Title check — ✅ Passed: the PR title clearly and concisely describes the main architectural change: replacing environment-variable-based MCP configuration with bind-mounted YAML config files.
  • Description check — ✅ Passed: the PR description provides comprehensive context with a clear summary, detailed key changes, testing approach, and issue references, though it does not follow the template structure exactly (missing a formal Checklist section).




@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

🧹 Nitpick comments (3)
server/internal/orchestrator/swarm/service_spec_test.go (1)

148-195: Consider adding checkContainer validation for resource limits test.

This test case sets DataPath (line 176) but doesn't include a checkContainer validation function. While the focus is on resource limits, the absence of container spec validation means regressions in the bind mount or command configuration won't be caught when resource limits are applied.

Consider adding at least a minimal checkContainer to verify the bind mount is still correctly configured when resource limits are present.

Suggested addition
 			DataPath:          "/var/lib/pgedge/services/db1-mcp-server-host1",
 		},
 		wantErr: false,
+		checkContainer: func(t *testing.T, spec *swarm.ContainerSpec) {
+			// Verify bind mount is present even with resource limits
+			if len(spec.Mounts) != 1 {
+				t.Errorf("got %d mounts, want 1", len(spec.Mounts))
+			}
+		},
 		checkResources: func(t *testing.T, resources *swarm.ResourceRequirements) {
server/internal/orchestrator/swarm/service_instance_spec.go (1)

62-68: Make the data-dir dependency explicit (and validate DataDirID).

Refresh reads DirResource directly, but Dependencies() relies on an implicit transitive path. Declaring the dir dependency directly makes orchestration order explicit and safer.

♻️ Proposed refactor
 func (s *ServiceInstanceSpecResource) Dependencies() []resource.Identifier {
 	// Service instances depend on the database network, service user role, and MCP config
 	return []resource.Identifier{
 		NetworkResourceIdentifier(s.DatabaseNetworkID),
 		ServiceUserRoleIdentifier(s.ServiceInstanceID),
+		filesystem.DirResourceIdentifier(s.DataDirID),
 		MCPConfigResourceIdentifier(s.ServiceInstanceID),
 	}
 }
@@
 	// Resolve the data directory path from the DirResource
+	if s.DataDirID == "" {
+		return fmt.Errorf("data_dir_id is required")
+	}
 	dataPath, err := filesystem.DirResourceFullPath(rc, s.DataDirID)
As per coding guidelines `server/internal/orchestrator/swarm/**/*.go`: Docker Swarm integration should use bind mounts for configuration and data directories.

Also applies to: 95-99

server/internal/database/mcp_service_config.go (1)

84-447: Prefer typed package errors over free-form validation strings.

The parser emits many raw string errors, which makes consistent downstream classification/mapping harder. Consider package-level typed/domain errors with field and reason metadata.

As per coding guidelines "Domain-specific errors should be defined in each package; API errors should be mapped to HTTP status codes via Goa".

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@e2e/service_provisioning_test.go`:
- Around line 871-874: The test currently asserts db.ServiceInstances[0].State
== "running" immediately which can flake due to StoreServiceInstance upserting a
transient "creating" state; change the assertion to poll until the state becomes
"running" (keep the require.Len check) by using a retry helper such as
require.Eventually or a small loop with timeout/tick that reads
db.ServiceInstances and checks ServiceInstances[0].State == "running" to allow
the monitor to converge; reference db.ServiceInstances (and the upsert behavior
from StoreServiceInstance) when locating the assertion to replace.

In `@server/internal/database/mcp_service_config.go`:
- Around line 198-214: The validation currently only checks embedding_model and
embedding_api_key when embeddingProvider is present; add a guard to reject
orphan embedding fields when embeddingProvider is nil: if embeddingProvider ==
nil and (embeddingModel != nil || embeddingAPIKey != nil) append appropriate
errors to errs (e.g., "embedding_model must not be set without
embedding_provider" and "embedding_api_key must not be set without
embedding_provider"). Place this check alongside the existing embedding-provider
cross-validation (referencing embeddingProvider, embeddingModel,
embeddingAPIKey, errs and validEmbeddingProviders) so configs with orphan
embedding fields fail validation early.

In `@server/internal/orchestrator/swarm/mcp_config_resource.go`:
- Around line 99-159: The Create and Update methods in MCPConfigResource assume
r.Config is non-nil and later methods (writeConfigFile, writeTokenFileIfNeeded,
writeUserFileIfNeeded) dereference it; add a guard at the start of both Create
and Update after populateCredentials that returns a clear error if r.Config ==
nil (or skip token/user generation when Config is nil), ensuring writeConfigFile
and the token/user writers never receive a nil config; reference
MCPConfigResource, Create, Update, populateCredentials, writeConfigFile,
writeTokenFileIfNeeded, and writeUserFileIfNeeded when locating where to perform
this nil check.
- Around line 90-94: The refresh currently only checks config.yaml via
readResourceFile(dirPath, "config.yaml") so add validation for all managed files
(e.g., tokens.yaml and users.yaml) before returning success; update the logic
around readResourceFile to iterate a list of required filenames (at minimum
"config.yaml", "tokens.yaml", "users.yaml"), call readResourceFile for each
using the same dirPath, and return a wrapped error that includes the missing
filename when any read fails so the Refresh function fails fast on missing
managed files.

In `@server/internal/orchestrator/swarm/mcp_config_test.go`:
- Around line 11-13: Remove the three unused helper functions float64Ptr,
intPtrMCP, and boolPtrMCP from
server/internal/orchestrator/swarm/mcp_config_test.go to satisfy lint; locate
and delete the function declarations for float64Ptr(f float64) *float64,
intPtrMCP(i int) *int, and boolPtrMCP(b bool) *bool, and run tests/linter to
confirm nothing else references them before committing.

In `@server/internal/orchestrator/swarm/mcp_config.go`:
- Around line 95-97: GenerateMCPConfig currently dereferences params.Config
without checks; add input guards at the start of GenerateMCPConfig to verify
params != nil and params.Config != nil and return a clear typed error (not
panic) when invalid. Update callers/tests if needed to handle the returned
error; reference the function name GenerateMCPConfig and the field access
params.Config when locating the change.

In `@server/internal/orchestrator/swarm/service_spec.go`:
- Around line 87-90: Validate opts.DataPath before constructing mounts: if
opts.DataPath is empty, return a clear error immediately instead of calling
docker.BuildMount. Update the code around where mounts is built (the mounts
variable and call to docker.BuildMount) to perform a check like if opts.DataPath
== "" { return fmt.Errorf("data path required for service mount") } so the
function fails fast with a helpful message rather than allowing Docker to fail
later.
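The fail-fast check suggested for service_spec.go could be sketched as follows; bindMountArg is a hypothetical helper standing in for the mount-building code, not the PR's docker.BuildMount:

```go
package main

import (
	"fmt"
	"strings"
)

// bindMountArg assembles a docker-style bind mount string for the service
// data directory, failing fast on an empty source path instead of letting
// Docker reject the mount later. Hypothetical helper for illustration.
func bindMountArg(dataPath, target string) (string, error) {
	if strings.TrimSpace(dataPath) == "" {
		return "", fmt.Errorf("data path required for service mount")
	}
	return dataPath + ":" + target, nil
}

func main() {
	arg, err := bindMountArg("/var/lib/pgedge/services/db1-mcp-server-host1", "/app/data")
	if err != nil {
		panic(err)
	}
	fmt.Println(arg)
}
```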


ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b81d259 and 4b66b6a.

📒 Files selected for processing (19)
  • e2e/service_provisioning_test.go
  • server/internal/api/apiv1/convert.go
  • server/internal/api/apiv1/validate.go
  • server/internal/api/apiv1/validate_test.go
  • server/internal/database/mcp_service_config.go
  • server/internal/database/mcp_service_config_test.go
  • server/internal/orchestrator/swarm/mcp_auth_files.go
  • server/internal/orchestrator/swarm/mcp_auth_files_test.go
  • server/internal/orchestrator/swarm/mcp_config.go
  • server/internal/orchestrator/swarm/mcp_config_resource.go
  • server/internal/orchestrator/swarm/mcp_config_test.go
  • server/internal/orchestrator/swarm/orchestrator.go
  • server/internal/orchestrator/swarm/resources.go
  • server/internal/orchestrator/swarm/service_instance_spec.go
  • server/internal/orchestrator/swarm/service_spec.go
  • server/internal/orchestrator/swarm/service_spec_test.go
  • server/internal/orchestrator/swarm/service_user_role.go
  • server/internal/workflows/activities/generate_service_instance_resources.go
  • server/internal/workflows/plan_update.go

postgres.Statement{SQL: fmt.Sprintf("GRANT SELECT ON ALL TABLES IN SCHEMA public TO %s;", sanitizeIdentifier(r.Username))},
postgres.Statement{SQL: fmt.Sprintf("ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO %s;", sanitizeIdentifier(r.Username))},
// Allow viewing PostgreSQL configuration via diagnostic tools
postgres.Statement{SQL: fmt.Sprintf("GRANT pg_read_all_settings TO %s;", sanitizeIdentifier(r.Username))},
Member


Is this the only missing permission on the existing read_only role? We don't document those canned roles, and I don't know that anyone is actually using it, so I'm wondering if we should just add this permission to that role.

Contributor Author


I'm not sure, but it feels like we're going to need service-specific roles, since each service will have its own baseline of required db permissions.

Member

@jason-lynch jason-lynch left a comment


Nice! This looks like a good improvement overall. I left a small request regarding library choices, but otherwise this looks good to go. Please remember to run go mod tidy and make licenses after swapping out gopkg.in/yaml.v3 in case it affects NOTICE.txt


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (3)
server/internal/orchestrator/swarm/mcp_auth_files_test.go (1)

244-256: Strengthen bcrypt uniqueness test to verify salt behavior directly.

Using different plaintext passwords will naturally produce different hashes, so this doesn't validate salted hashing behavior. Use the same password for both users and keep the NotEqual assertion.

Proposed refactor
 t.Run("multiple users: each has unique hash", func(t *testing.T) {
 	users := []database.MCPServiceUser{
-		{Username: "x", Password: "passX"},
-		{Username: "y", Password: "passY"},
+		{Username: "x", Password: "same-pass"},
+		{Username: "y", Password: "same-pass"},
 	}
 	data, err := GenerateUserFile(users)
 	require.NoError(t, err)

 	var store mcpUserStore
 	require.NoError(t, yaml.Unmarshal(data, &store))

 	assert.NotEqual(t, store.Users["x"].PasswordHash, store.Users["y"].PasswordHash)
 })
server/internal/orchestrator/swarm/mcp_config_test.go (1)

379-412: Add a mixed-provider ollama embedding test case.

Please add a case for llm_provider=anthropic|openai with embedding_provider=ollama to pin expected embedding.ollama_url behavior (present default vs omitted).

🧪 Test scaffold
+func TestGenerateMCPConfig_EmbeddingOllama_WithNonOllamaLLM(t *testing.T) {
+	embProvider := "ollama"
+	embModel := "nomic-embed-text"
+	params := &MCPConfigParams{
+		Config: &database.MCPServiceConfig{
+			LLMProvider:       "anthropic",
+			LLMModel:          "claude-sonnet-4-5",
+			AnthropicAPIKey:   strPtr("sk-ant-api03-test"),
+			EmbeddingProvider: &embProvider,
+			EmbeddingModel:    &embModel,
+		},
+		DatabaseName: "mydb",
+		DatabaseHost: "db-host",
+		DatabasePort: 5432,
+		Username:     "appuser",
+		Password:     "secret",
+	}
+
+	data, err := GenerateMCPConfig(params)
+	if err != nil {
+		t.Fatalf("GenerateMCPConfig() error = %v", err)
+	}
+	cfg := parseYAML(t, data)
+	if cfg.Embedding == nil {
+		t.Fatal("embedding section should be present")
+	}
+	// Assert decided behavior for embedding.ollama_url here.
+}
server/internal/database/mcp_service_config.go (1)

112-135: Reject non-selected LLM credential keys to keep config unambiguous.

Lines 112-135 require the active provider credential, but configs can still include credentials for other providers and pass validation. That can leave contradictory/stale secrets in persisted config.

♻️ Proposed tightening
 	// Provider-specific API key cross-validation
 	var anthropicKey, openaiKey, ollamaURL *string
 	if llmProvider != "" && slices.Contains(validLLMProviders, llmProvider) {
+		if llmProvider != "anthropic" {
+			if _, ok := config["anthropic_api_key"]; ok {
+				errs = append(errs, fmt.Errorf("anthropic_api_key must not be set when llm_provider is %q", llmProvider))
+			}
+		}
+		if llmProvider != "openai" {
+			if _, ok := config["openai_api_key"]; ok {
+				errs = append(errs, fmt.Errorf("openai_api_key must not be set when llm_provider is %q", llmProvider))
+			}
+		}
+		if llmProvider != "ollama" {
+			if _, ok := config["ollama_url"]; ok {
+				errs = append(errs, fmt.Errorf("ollama_url must not be set when llm_provider is %q", llmProvider))
+			}
+		}
+
 		switch llmProvider {
 		case "anthropic":
 			key, keyErrs := requireStringForProvider(config, "anthropic_api_key", "anthropic")
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@server/internal/orchestrator/swarm/mcp_auth_files_test.go`:
- Around line 284-291: The tests for GenerateEmptyTokenFile and
GenerateEmptyUserFile currently use assert.Empty on store.Tokens/store.Users
which doesn't fail if the keys are missing; instead unmarshal the generated YAML
into a generic map[string]interface{} (e.g., var m map[string]interface{}),
assert that the top-level keys "tokens" and "users" exist in that map, assert
their values are maps (or slices) and that their length is zero; keep the
existing checks against mcpTokenStore/mcpUserStore if desired but add these
explicit key-existence assertions referencing GenerateEmptyTokenFile,
mcpTokenStore.Tokens, GenerateEmptyUserFile, and mcpUserStore.Users to guarantee
the keys are serialized.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3a23c036-84cf-4c16-bd6d-220b81df9234

📥 Commits

Reviewing files that changed from the base of the PR and between 22084dd and d9f7939.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (9)
  • NOTICE.txt
  • e2e/service_provisioning_test.go
  • go.mod
  • server/internal/database/mcp_service_config.go
  • server/internal/database/mcp_service_config_test.go
  • server/internal/orchestrator/swarm/mcp_auth_files.go
  • server/internal/orchestrator/swarm/mcp_auth_files_test.go
  • server/internal/orchestrator/swarm/mcp_config.go
  • server/internal/orchestrator/swarm/mcp_config_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • server/internal/orchestrator/swarm/mcp_auth_files.go
  • e2e/service_provisioning_test.go

ran "go mod tidy" && "make licenses"
@rshoemaker rshoemaker force-pushed the feat/PLAT-444/mcp_service_config branch from 82e9b72 to 0986d73 Compare March 5, 2026 19:13
@rshoemaker rshoemaker merged commit 9b45dfc into main Mar 5, 2026
3 checks passed