- Notifications
You must be signed in to change notification settings - Fork 132
Add MCPRemoteProxy CRD controller for remote MCP server proxying #2226
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements the MCPRemoteProxy custom resource to enable Kubernetes-based proxying of remote MCP servers with comprehensive features including authentication, authorization, audit logging, and tool filtering. This provides organizations with centralized visibility and policy enforcement for employee interactions with external SaaS MCP services.
Key changes:
- Adds MCPRemoteProxy CRD with comprehensive validation and status tracking
- Implements complete controller logic with refactored, maintainable code structure
- Includes shared helper functions for common operations across controllers
Reviewed Changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.
Show a summary per file
File | Description |
---|---|
docs/operator/crd-api.md | Adds comprehensive API documentation for MCPRemoteProxy CRD |
deploy/charts/operator/templates/clusterrole/role.yaml | Extends RBAC permissions for MCPRemoteProxy resources |
deploy/charts/operator-crds/crds/toolhive.stacklok.dev_mcpremoteproxies.yaml | Complete CRD manifest with validation schema |
cmd/thv-operator/main.go | Registers MCPRemoteProxy controller with manager |
cmd/thv-operator/controllers/mcpremoteproxy_* | Core controller implementation with deployment, service, and RunConfig management |
cmd/thv-operator/controllers/common_helpers.go | Shared helper functions for platform detection, RBAC, and configuration management |
cmd/thv-operator/api/v1alpha1/mcpremoteproxy_types.go | Type definitions for MCPRemoteProxy with comprehensive field validation |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
deploy/charts/operator-crds/crds/toolhive.stacklok.dev_mcpremoteproxies.yaml Outdated Show resolved Hide resolved
@claude Please review this pull request. Ensure that it adheres to kubernetes best practices and that it has good security measures in mind. You can find more context in docs/proposals/THV-2151-remote-mcp-proxy.md |
Claude finished @JAORMX's task —— View job PR Review: MCPRemoteProxy CRD and ControllerI've completed a comprehensive security and Kubernetes best practices review of this PR. Overall, this is a well-structured implementation with strong security foundations. Below are my findings organized by category. ✅ StrengthsSecurity Architecture
Kubernetes Best Practices
Code Quality
🔒 Security FindingsCritical IssuesNone found - No critical security vulnerabilities identified. Medium Priority Issues1. OIDC Configuration Validation Gap (cmd/thv-operator/api/v1alpha1/mcpremoteproxy_types.go:28)Issue: While Risk: Proxy could be deployed without authentication, allowing unauthenticated access to remote MCP servers. Recommendation: // Change line 27 from +optional to: // +kubebuilder:validation:Required OIDCConfig *OIDCConfigRef `json:"oidcConfig"` The controller validation at line 288 catches this, but CRD-level validation prevents invalid resources from being created in the first place. 2. Secret Key Existence Validation (cmd/thv-operator/controllers/mcpremoteproxy_runconfig.go:307)Current behavior: Validates secret key existence during RunConfig creation. Improvement: While this works, consider adding a webhook validator to fail faster at resource creation time, providing better UX. Low Priority Issues3. Typo in CRD Description (Multiple files)Files:
Issue: Comment says "mcprremoteproxies" (double 'r') instead of "mcpremoteproxies" Fix: Correct the spelling in all three locations. 4. Code Comment Clarity (cmd/thv-operator/controllers/mcpremoteproxy_runconfig.go:199)The comment "Deployer is set to nil" could be clearer about why this is appropriate for remote proxies. Suggested improvement: // Use the RunConfigBuilder for operator context. // Deployer is nil for remote proxies because they connect to external services // and do not require container deployment configuration (unlike MCPServer which deploys containers). 5. Code Duplication Workaround (cmd/thv-operator/controllers/mcpremoteproxy_runconfig.go:216)The comment at line 216 acknowledges creating temporary MCPServer objects as "hacky." While functional, consider: Future refactoring option: Extract common authz config logic into a shared function that accepts the base This is acceptable technical debt for now but should be noted for future cleanup. 🏗️ Kubernetes Best Practices ReviewExcellent Practices ObservedResource Management
Observability
RBAC ConfigurationThe ClusterRole (deploy/charts/operator/templates/clusterrole/role.yaml) follows least privilege:
Minor Improvements1. Hardcoded Replica Count (cmd/thv-operator/controllers/mcpremoteproxy_deployment.go:24)Current: Replicas hardcoded to 1 Consideration: The proposal mentions "horizontally scalable architecture," but the CRD spec doesn't include a Recommendation: Either:
2. Service Type FlexibilityCurrent: Service type is not configurable (defaults to ClusterIP) Consideration: Organizations may want LoadBalancer or NodePort for external access. Recommendation: Consider adding 3. Resource Limits Best PracticeObservation: Recommendation: Consider adding default resource requests in the CRD defaults: +kubebuilder:default={"requests":{"cpu":"100m","memory":"128Mi"}} This ensures resources are scheduled properly even when users don't specify requirements. 📋 Test Coverage AssessmentExcellent CoverageThe PR includes comprehensive unit tests:
Test Quality Observations
Suggested Additional TestsWhile coverage is good, consider adding:
🎯 Alignment with ProposalI reviewed against ✅ Fully Implemented
|
Note that I added cmd/thv-operator/controllers/common_helpers.go which will be shared with the MCServer reconciler . It's more verbose but it will be re-used later. |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@ ## main #2226 +/- ## ========================================== + Coverage 53.32% 53.88% +0.55% ========================================== Files 231 234 +3 Lines 29529 30472 +943 ========================================== + Hits 15747 16420 +673 - Misses 12649 12852 +203 - Partials 1133 1200 +67 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
3e00c1b
to 00a63f3
Compare @claude please do another kubernetes + security focused review of this. |
Claude finished @JAORMX's task —— View job Kubernetes + Security Review of MCPRemoteProxyI've completed a comprehensive security and Kubernetes best practices review of the MCPRemoteProxy implementation. This is a well-designed and secure implementation that follows Kubernetes best practices and implements proper security controls. Executive SummaryOverall Assessment: ✅ APPROVED This implementation demonstrates:
Recommendation: Approve for merge with minor documentation clarifications (non-blocking). Security Review✅ Strengths1. Authentication Architecture (Excellent)
2. Secret Management (Strong)
Security validation example (mcpremoteproxy_runconfig.go:276-288): // Validates that the referenced Kubernetes secret exists and contains the required key var secret corev1.Secret if err := r.Get(ctx, types.NamespacedName{ Namespace: proxy.Namespace, Name: tokenExchangeSpec.ClientSecretRef.Name, }, &secret); err != nil { return fmt.Errorf("failed to get client secret: %w", err) } if _, ok := secret.Data[tokenExchangeSpec.ClientSecretRef.Key]; !ok { return fmt.Errorf("secret is missing key %q", tokenExchangeSpec.ClientSecretRef.Key) } 3. Security Contexts (Excellent)
Security context implementation (security.go:54-86) correctly adapts to platform:
4. RBAC Configuration (Principle of Least Privilege)Operator-level RBAC (role.yaml:100-137):
Proxy runner RBAC (mcpremoteproxy_controller.go:354): Rules: defaultRBACRules Uses the same minimal permissions as MCPServer (statefulsets, pods, services - needed for in-cluster MCP server management). Question for consideration: Does the remote proxy actually need these permissions? Remote proxies don't deploy MCP server containers - they only connect to external services. The RBAC rules include permissions for statefulsets/pods that may not be needed for remote proxies. Recommendation: Consider creating 5. Network Security Considerations
Documentation gap: The proposal mentions network policies (THV-2151:487-493) but doesn't include example manifests. Consider adding:
6. Input Validation (Multi-layer)
🔒 Security FindingsHigh Priority (Non-blocking, but recommended)Finding 1: RBAC Overpermissioning for Remote Proxies
Finding 2: Missing Finalizer for Resource Cleanup
Medium Priority (Documentation)Finding 3: Audit Log Destination Not Documented
Kubernetes Best Practices Review✅ Excellent Practices Observed1. Controller Pattern Implementation
2. Resource Change Detection
Example implementation (mcpremoteproxy_runconfig.go:87-91): currentChecksum := current.Annotations["toolhive.stacklok.dev/content-checksum"] desiredChecksum := desired.Annotations["toolhive.stacklok.dev/content-checksum"] if currentChecksum != desiredChecksum { // Only update if content changed } 3. Health and Observability
4. Code Organization
5. Resource Management
🎯 Observations and RecommendationsObservation 1: Hardcoded Replica CountLocation: mcpremoteproxy_deployment.go:24 replicas := int32(1) Context: The proposal mentions "horizontally scalable architecture" (THV-2151:7, 65) Current behavior: Controller doesn't manage replicas after initial creation (mcpremoteproxy_controller.go:200-201): // Deployment exists - do not manage replica count to allow HPA/external scaling Assessment: ✅ This is correct! The controller intentionally doesn't manage replicas after creation to allow HPA (HorizontalPodAutoscaler) to scale independently. Recommendation: Consider documenting this design decision in the CRD comments: // Replicas are initially set to 1. For horizontal scaling, use a HorizontalPodAutoscaler // rather than modifying the Deployment directly, as the controller does not manage // replica count after initial creation. Observation 2: Status Field Not PopulatedLocation: mcpremoteproxy_types.go:82-83 // ExternalURL is the external URL where the proxy can be accessed (if exposed externally) ExternalURL string `json:"externalURL,omitempty"` Issue: This field is never populated by the controller Assessment: This is likely out of scope for the controller (external URLs would come from Ingress/Route resources managed separately) Recommendation: Either:
Observation 3: Condition Type Defined But Not UsedLocation: mcpremoteproxy_types.go:129-130 // ConditionTypeRemoteAvailable indicates whether the remote MCP server is reachable ConditionTypeRemoteAvailable = "RemoteAvailable" Issue: This condition type is defined but never set by the controller Assessment: Health checking the remote MCP is mentioned in the proposal (THV-2151:311-312) but not implemented Recommendation:
Observation 4: Transport Type FlexibilityLocation: mcpremoteproxy_types.go:20-23 // Transport is the transport method for the remote proxy (sse or streamable-http) // +kubebuilder:validation:Enum=sse;streamable-http // +kubebuilder:default=streamable-http Transport string `json:"transport,omitempty"` Assessment: ✅ Excellent! Allows users to choose transport type while defaulting to the most efficient option. Validation: Properly enforced in RunConfig validation (mcpremoteproxy_runconfig.go:348-351) Test Coverage Assessment✅ Comprehensive Test SuiteTest files reviewed:
Total: 3,046 lines of test code for the remote proxy feature Test Quality Observations✅ Excellent coverage of:
✅ Test patterns:
Codecov report shows: 71.28% patch coverage with 338 lines missing coverage Assessment: This is reasonable coverage for an initial implementation. The missing lines are primarily:
Recommendation: The test coverage is sufficient for merge. Consider adding integration tests in future PRs:
Alignment with Proposal (THV-2151)✅ Fully Implemented Features
|
Threat | Mitigation | Assessment |
---|---|---|
Token interception | HTTPS required | ✅ URL pattern enforces https:// |
Spoofed identity | JWT signature validation | ✅ OIDC JWKS validation implemented |
Policy bypass | Middleware chain | ✅ All requests through middleware |
Proxy compromise | No token storage | ✅ Stateless architecture |
Invalid JWT claims | Signature validation | ✅ JWKS prevents tampering |
JWKS endpoint compromise | Cert validation, refresh | ✅ Standard OIDC validation |
Assessment: All identified threats are properly mitigated.
Additional consideration: Supply chain security for the thv-proxyrunner
image
- ✅ Image is built from known source (getToolhiveRunnerImage())
- ✅ Security contexts prevent container breakout
- Recommendation: Document image provenance and verification process
Final Recommendations
Must Address (Non-blocking for merge)
- ✅ None - No critical issues found
Should Address (High Priority)
- RBAC Overpermissioning: Consider creating separate
remoteProxyRBACRules
with minimal permissions - Documentation: Add audit log destination documentation
- NetworkPolicy Examples: Provide example manifests for egress policies
Nice to Have (Future Work)
- Health Checking: Implement RemoteAvailable condition with periodic health checks
- ExternalURL Population: Document or implement external URL tracking
- Example Manifests: Add complete example manifests to docs/examples/
- Troubleshooting Guide: Document common failure scenarios and resolution steps
Conclusion
This is a high-quality, production-ready implementation that demonstrates:
Security: ✅ EXCELLENT
- No critical security vulnerabilities
- Strong defense-in-depth architecture
- Proper secret handling and RBAC
- Platform-aware security contexts
Kubernetes Best Practices: ✅ EXCELLENT
- Proper controller patterns
- Efficient resource management
- Comprehensive test coverage
- Clean code organization
Architecture: ✅ SOUND
- Stateless, horizontally scalable
- Proper separation of concerns
- Reusable shared helpers
- Clear extension points
RECOMMENDATION: APPROVE for merge
The identified issues are minor and mostly documentation-related. The core implementation is secure, well-tested, and follows Kubernetes best practices. The suggestions above are enhancements that can be addressed in follow-up PRs.
Great work on this implementation! 🎉
Job: https://github.com/stacklok/toolhive/actions/runs/18557234390
Branch: feature/mcpremoteproxy-implementation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 16 out of 16 changed files in this pull request and generated 6 comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
…dPlatformDetector This extracts common helper functions from the large MCPRemoteProxy PR (#2226) into a shared common_helpers.go file that can be used by multiple controllers. Changes: - Add common_helpers.go with shared helper functions: - SharedPlatformDetector for platform detection across controllers - RBAC resource creation helpers (EnsureRBACResource, createRBACResource) - OpenTelemetry environment variable generation - Token exchange environment variable generation - Authorization volume configuration (GenerateAuthzVolumeConfig) - Authorization ConfigMap management (EnsureAuthzConfigMap) - Authorization config options for RunConfig (AddAuthzConfigOptions) - GetExternalAuthConfigByName for fetching external auth configs - Refactor MCPServerReconciler to use SharedPlatformDetector: - Replace individual platformDetector, detectedPlatform, and platformOnce fields with single PlatformDetector field - Update detectPlatform() to use shared detector - Initialize SharedPlatformDetector in main.go - Update deploymentForMCPServer to use local detectedPlatform variable - Add test helpers and fixtures: - newTestMCPServerReconciler() for test setup - newTestMCPServerReconcilerWithDetector() for custom detector injection - NewSharedPlatformDetectorWithDetector() for testing with mock detectors - Update all test files to use new helper functions - Fix imports in all test files to include kubernetes package This is the first step in adopting the common helpers from PR #2226. Additional refactoring will follow to adopt the remaining helper functions. 🤖 Generated with [Claude Code](https://claude.com/claude-code)
This completes the adoption of common helper functions from PR #2226, removing duplication between the MCPServer controller and the upcoming MCPRemoteProxy controller. Changes: - Replace private generateOpenTelemetryEnvVars with common GenerateOpenTelemetryEnvVars - Replace private generateTokenExchangeEnvVars with common GenerateTokenExchangeEnvVars - Replace private generateAuthzVolumeConfig with common GenerateAuthzVolumeConfig - Replace private ensureAuthzConfigMap with common EnsureAuthzConfigMap - Replace private addAuthzConfigOptions with common AddAuthzConfigOptions - Update all test files to use common helpers directly - Remove unused imports (yaml, authz) from mcpserver_runconfig.go All functions are now shared and reusable across multiple controllers, reducing code duplication and improving maintainability. 🤖 Generated with [Claude Code](https://claude.com/claude-code)
…dPlatformDetector This extracts common helper functions from the large MCPRemoteProxy PR (#2226) into a shared common_helpers.go file that can be used by multiple controllers. Changes: - Add common_helpers.go with shared helper functions: - SharedPlatformDetector for platform detection across controllers - RBAC resource creation helpers (EnsureRBACResource, createRBACResource) - OpenTelemetry environment variable generation - Token exchange environment variable generation - Authorization volume configuration (GenerateAuthzVolumeConfig) - Authorization ConfigMap management (EnsureAuthzConfigMap) - Authorization config options for RunConfig (AddAuthzConfigOptions) - GetExternalAuthConfigByName for fetching external auth configs - Refactor MCPServerReconciler to use SharedPlatformDetector: - Replace individual platformDetector, detectedPlatform, and platformOnce fields with single PlatformDetector field - Update detectPlatform() to use shared detector - Initialize SharedPlatformDetector in main.go - Update deploymentForMCPServer to use local detectedPlatform variable - Add test helpers and fixtures: - newTestMCPServerReconciler() for test setup - newTestMCPServerReconcilerWithDetector() for custom detector injection - NewSharedPlatformDetectorWithDetector() for testing with mock detectors - Update all test files to use new helper functions - Fix imports in all test files to include kubernetes package This is the first step in adopting the common helpers from PR #2226. Additional refactoring will follow to adopt the remaining helper functions. 🤖 Generated with [Claude Code](https://claude.com/claude-code)
This completes the adoption of common helper functions from PR #2226, removing duplication between the MCPServer controller and the upcoming MCPRemoteProxy controller. Changes: - Replace private generateOpenTelemetryEnvVars with common GenerateOpenTelemetryEnvVars - Replace private generateTokenExchangeEnvVars with common GenerateTokenExchangeEnvVars - Replace private generateAuthzVolumeConfig with common GenerateAuthzVolumeConfig - Replace private ensureAuthzConfigMap with common EnsureAuthzConfigMap - Replace private addAuthzConfigOptions with common AddAuthzConfigOptions - Update all test files to use common helpers directly - Remove unused imports (yaml, authz) from mcpserver_runconfig.go All functions are now shared and reusable across multiple controllers, reducing code duplication and improving maintainability. 🤖 Generated with [Claude Code](https://claude.com/claude-code)
…dPlatformDetector This extracts common helper functions from the large MCPRemoteProxy PR (#2226) into a shared common_helpers.go file that can be used by multiple controllers. Changes: - Add common_helpers.go with shared helper functions: - SharedPlatformDetector for platform detection across controllers - RBAC resource creation helpers (EnsureRBACResource, createRBACResource) - OpenTelemetry environment variable generation - Token exchange environment variable generation - Authorization volume configuration (GenerateAuthzVolumeConfig) - Authorization ConfigMap management (EnsureAuthzConfigMap) - Authorization config options for RunConfig (AddAuthzConfigOptions) - GetExternalAuthConfigByName for fetching external auth configs - Refactor MCPServerReconciler to use SharedPlatformDetector: - Replace individual platformDetector, detectedPlatform, and platformOnce fields with single PlatformDetector field - Update detectPlatform() to use shared detector - Initialize SharedPlatformDetector in main.go - Update deploymentForMCPServer to use local detectedPlatform variable - Add test helpers and fixtures: - newTestMCPServerReconciler() for test setup - newTestMCPServerReconcilerWithDetector() for custom detector injection - NewSharedPlatformDetectorWithDetector() for testing with mock detectors - Update all test files to use new helper functions - Fix imports in all test files to include kubernetes package This is the first step in adopting the common helpers from PR #2226. Additional refactoring will follow to adopt the remaining helper functions. 🤖 Generated with [Claude Code](https://claude.com/claude-code)
This completes the adoption of common helper functions from PR #2226, removing duplication between the MCPServer controller and the upcoming MCPRemoteProxy controller. Changes: - Replace private generateOpenTelemetryEnvVars with common GenerateOpenTelemetryEnvVars - Replace private generateTokenExchangeEnvVars with common GenerateTokenExchangeEnvVars - Replace private generateAuthzVolumeConfig with common GenerateAuthzVolumeConfig - Replace private ensureAuthzConfigMap with common EnsureAuthzConfigMap - Replace private addAuthzConfigOptions with common AddAuthzConfigOptions - Update all test files to use common helpers directly - Remove unused imports (yaml, authz) from mcpserver_runconfig.go All functions are now shared and reusable across multiple controllers, reducing code duplication and improving maintainability. 🤖 Generated with [Claude Code](https://claude.com/claude-code)
* Extract common controller helpers and refactor MCPServer to use SharedPlatformDetector This extracts common helper functions from the large MCPRemoteProxy PR (#2226) into a shared common_helpers.go file that can be used by multiple controllers. Changes: - Add common_helpers.go with shared helper functions: - SharedPlatformDetector for platform detection across controllers - RBAC resource creation helpers (EnsureRBACResource, createRBACResource) - OpenTelemetry environment variable generation - Token exchange environment variable generation - Authorization volume configuration (GenerateAuthzVolumeConfig) - Authorization ConfigMap management (EnsureAuthzConfigMap) - Authorization config options for RunConfig (AddAuthzConfigOptions) - GetExternalAuthConfigByName for fetching external auth configs - Refactor MCPServerReconciler to use SharedPlatformDetector: - Replace individual platformDetector, detectedPlatform, and platformOnce fields with single PlatformDetector field - Update detectPlatform() to use shared detector - Initialize SharedPlatformDetector in main.go - Update deploymentForMCPServer to use local detectedPlatform variable - Add test helpers and fixtures: - newTestMCPServerReconciler() for test setup - newTestMCPServerReconcilerWithDetector() for custom detector injection - NewSharedPlatformDetectorWithDetector() for testing with mock detectors - Update all test files to use new helper functions - Fix imports in all test files to include kubernetes package This is the first step in adopting the common helpers from PR #2226. Additional refactoring will follow to adopt the remaining helper functions. 🤖 Generated with [Claude Code](https://claude.com/claude-code) * Refactor MCPServer controller to use common helper functions This completes the adoption of common helper functions from PR #2226, removing duplication between the MCPServer controller and the upcoming MCPRemoteProxy controller. Changes: - Replace private generateOpenTelemetryEnvVars with common GenerateOpenTelemetryEnvVars - Replace private generateTokenExchangeEnvVars with common GenerateTokenExchangeEnvVars - Replace private generateAuthzVolumeConfig with common GenerateAuthzVolumeConfig - Replace private ensureAuthzConfigMap with common EnsureAuthzConfigMap - Replace private addAuthzConfigOptions with common AddAuthzConfigOptions - Update all test files to use common helpers directly - Remove unused imports (yaml, authz) from mcpserver_runconfig.go All functions are now shared and reusable across multiple controllers, reducing code duplication and improving maintainability. 🤖 Generated with [Claude Code](https://claude.com/claude-code) * Fix linting issues in common helpers refactoring - Break long lines for GenerateTokenExchangeEnvVars calls - Break long line for EnsureAuthzConfigMap call - Rename mcpserver_test_helpers.go to mcpserver_test_helpers_test.go - Remove duplicate mockPlatformDetector from mcpserver_platform_test.go - Fix import shadowing by renaming client parameter to k8sClient - Add nolint:unparam for platform parameter - Remove unused newTestMCPServerReconcilerWithDetector function 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Fix missing ConfigMap update logic in common helpers During the refactoring to extract common helpers, the ConfigMap update logic was accidentally omitted from EnsureAuthzConfigMap. This caused authorization policy updates to be ignored after initial ConfigMap creation, breaking a core feature. This commit restores the update logic that checks for data changes using reflect.DeepEqual and updates the ConfigMap when needed. Also replaces hardcoded "authz.json" string with the defaultAuthzKey constant for consistency. Changes: - Add reflect import to common_helpers.go - Restore ConfigMap update logic in EnsureAuthzConfigMap (lines 351-362) - Replace hardcoded string with defaultAuthzKey constant (line 430) Addresses critical issues identified in PR review. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Add test for ConfigMap update scenario in EnsureAuthzConfigMap Adds integration test that verifies ConfigMap updates when inline authorization policies change. This test ensures the fix for the missing update logic works correctly. Test scenario: 1. Create MCPServer with initial inline authz config 2. Verify ConfigMap is created with initial policies 3. Update MCPServer with different policies and entities 4. Verify ConfigMap is updated with new data This addresses the "Recommended Change" from the PR review to add integration test coverage for the ConfigMap update logic. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>
8a60dd1
to 0199032
Compare 0199032
to 24ad878
Compare @claude please review this with a security, kubernetes, and MCP expertise in mind. |
Claude finished @JAORMX's task —— View job Security, Kubernetes, and MCP Comprehensive ReviewI've completed an in-depth security, Kubernetes best practices, and MCP protocol review of the MCPRemoteProxy implementation. This is a well-architected, secure, and production-ready implementation that demonstrates strong engineering principles. Executive SummaryOverall Assessment: ✅ APPROVED FOR MERGE This implementation excels in:
Recommendation: Approve with one suggestion for future enhancement (RBAC scope reduction for remote proxies). Security Assessment ✅Strengths1. Authentication Architecture (Excellent)Token validation without storage (mcpremoteproxy_runconfig.go:340-341):
OIDC configuration (mcpremoteproxy_types.go:27-28): // +kubebuilder:validation:Required OIDCConfig OIDCConfigRef `json:"oidcConfig"` This ensures all remote proxies must have authentication configured at the API level. Token exchange support (mcpremoteproxy_runconfig.go:253-334):
2. Secret Management (Strong)Kubernetes secret references (mcpremoteproxy_runconfig.go:277-290): // Validates that the referenced Kubernetes secret exists and contains the required key var secret corev1.Secret if err := r.Get(ctx, types.NamespacedName{ Namespace: proxy.Namespace, Name: tokenExchangeSpec.ClientSecretRef.Name, }, &secret); err != nil { return fmt.Errorf("failed to get client secret: %w", err) } ✅ Security controls:
3. Security Contexts (Excellent)Platform-aware security (security.go:54-86):
Platform detection (common_helpers.go:60-86): func (s *SharedPlatformDetector) DetectPlatform(ctx context.Context) (kubernetes.Platform, error) { s.once.Do(func() { // Detect once and cache result s.detectedPlatform, err = s.detector.DetectPlatform(cfg) }) return s.detectedPlatform, err } 4. Input Validation (Multi-layer)Three levels of validation:
5. RBAC Configuration (Principle of Least Privilege)Operator-level RBAC (role.yaml:100-137):
Proxy runner RBAC (mcpremoteproxy_controller.go:343-354): Rules: defaultRBACRules
Recommendation for future enhancement (non-blocking): var remoteProxyRBACRules = []rbacv1.PolicyRule{ { APIGroups: []string{""}, Resources: []string{"configmaps"}, Verbs: []string{"get", "list", "watch"}, }, { APIGroups: []string{""}, Resources: []string{"secrets"}, Verbs: []string{"get", "list", "watch"}, }, } This would further reduce the attack surface, though the current approach is acceptable since:
Security Threat Model AssessmentBased on the proposal's threat model (THV-2151:476-486):
Additional security considerations:
Missing documentation (non-blocking):
Kubernetes Best Practices Review ✅Excellent Practices Observed1. Controller Pattern ImplementationIdempotent reconciliation (mcpremoteproxy_controller.go:48-80):
2. Resource Change DetectionContent checksums (mcpremoteproxy_runconfig.go:52-55): checksumCalculator := configMapChecksum.NewRunConfigConfigMapChecksum() checksum := checksumCalculator.ComputeConfigMapChecksum(configMap) configMap.Annotations = map[string]string{ "toolhive.stacklok.dev/content-checksum": checksum, } Hash tracking in status (mcpremoteproxy_types.go:94-99):
3. Health and ObservabilityHealth probes (mcpremoteproxy_deployment.go:62-63):
OpenTelemetry support (common_helpers.go:139-164): envVars = append(envVars, corev1.EnvVar{ Name: "OTEL_RESOURCE_ATTRIBUTES", Value: fmt.Sprintf("service.name=%s,service.namespace=%s", serviceName, namespace), }) Status conditions (mcpremoteproxy_types.go:125-158):
4. Code Organization (Excellent)Separation of concerns:
Clean builder pattern (mcpremoteproxy_deployment.go:26-33): args := r.buildContainerArgs() volumeMounts, volumes := r.buildVolumesForProxy(proxy) env := r.buildEnvVarsForProxy(ctx, proxy) resources := buildResourceRequirements(proxy.Spec.Resources) 5. Resource ManagementResource requirements (mcpremoteproxy_deployment.go:256-280):
Replica management (mcpremoteproxy_controller.go:200-201): // Deployment exists - do not manage replica count to allow HPA/external scaling ✅ Correct: Controller doesn't manage replicas after creation, allowing HPA to scale independently. Architecture Observations1. Hardcoded Initial Replica CountCurrent (mcpremoteproxy_deployment.go:24): replicas := int32(1) Context: The proposal mentions "horizontally scalable architecture" (THV-2151:7, 65). Assessment: ✅ This is correct! The controller intentionally doesn't manage replicas after creation to allow HPA (HorizontalPodAutoscaler) to scale independently. Documentation suggestion: Add a comment to clarify design decision: // Replicas are initially set to 1. For horizontal scaling, use a HorizontalPodAutoscaler // rather than modifying the Deployment directly, as the controller does not manage // replica count after initial creation. 2. Status Fields Not PopulatedExternalURL field (mcpremoteproxy_types.go:82-83): // ExternalURL is the external URL where the proxy can be accessed (if exposed externally) ExternalURL string `json:"externalURL,omitempty"` Issue: Never populated by controller. Assessment: External URLs come from Ingress/Route resources managed separately. Consider either:
3. Condition Type Defined But Not SetRemoteAvailable condition (mcpremoteproxy_types.go:129-130): // ConditionTypeRemoteAvailable indicates whether the remote MCP server is reachable ConditionTypeRemoteAvailable = "RemoteAvailable" Issue: Condition type defined but never set by controller. Assessment: Health checking the remote MCP is mentioned in proposal but not implemented yet. Consider:
MCP Protocol Integration ✅Transport Type SelectionConfiguration (mcpremoteproxy_types.go:20-23): // +kubebuilder:validation:Enum=sse;streamable-http // +kubebuilder:default=streamable-http Transport string `json:"transport,omitempty"` ✅ Excellent:
Middleware ConfigurationCorrect middleware ordering (Proposal THV-2151:445-451):
Implementation (mcpremoteproxy_runconfig.go:188-198):
RunConfig GenerationKey difference from MCPServer (mcpremoteproxy_runconfig.go:114-115): // createRunConfigFromMCPRemoteProxy converts MCPRemoteProxy spec to RunConfig // Key difference from MCPServer: Sets RemoteURL instead of Image, and Deployer remains nil Implementation (mcpremoteproxy_runconfig.go:162-163): // Key: Set RemoteURL instead of Image runner.WithRemoteURL(proxy.Spec.RemoteURL), ✅ Correct: Remote proxies don't deploy containers, so Deployer is nil. Placeholder Argument HandlingContainer args (mcpremoteproxy_deployment.go:82-86): func (*MCPRemoteProxyReconciler) buildContainerArgs() []string { // The third argument is required by proxyrunner command signature but is ignored // when RemoteURL is set (HTTPTransport.Setup returns early for remote servers) return []string{"run", "--foreground=true", "placeholder-for-remote-proxy"} } ✅ Well-documented: Explains why placeholder is needed and that it's safe (not used for remote servers). Test Coverage Assessment ✅Comprehensive Test SuiteTest files:
Total: 3,046 lines of test code Test Quality✅ Excellent coverage of:
✅ Test patterns:
Codecov report: 70.48% patch coverage (276 lines missing coverage) Assessment: This is good coverage for an initial implementation. Missing lines are primarily error handling paths and edge cases. Future testing recommendations (non-blocking):
Alignment with Proposal (THV-2151) ✅✅ Fully Implemented Features
|
e8bbbe6
to b3c7f96
Compare 9a5e089
to fe868ff
Compare There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't spot anything that screamed out at me in the MCPRemoteProxy
specific code, but I noticed there's quite a lot of shared code now between controllers that we're seeming to duplicate in this PR.
I also think that we'd get more from refactoring some pieces into separate modules to be reused and more finely tested. Not saying this PR should do them, but we should definitely start doing that now before we end up in a place where we have a bunch of code that looks the same but has a slight different in signatures.
I also think that having some integration tests for this PR would be invaluable (thv-operator/test-integration/mcp-remote-proxy/
. Even if it's the basic cases like it is with the mcp-server
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Duplication Analysis
This PR adds significant functionality for MCPRemoteProxy. During review, several patterns of code duplication were identified between the new MCPRemoteProxy controller and the existing MCPServer controller. These duplications range from small utility functions to substantial logic blocks (80+ lines).
Summary of Findings
High-Priority Duplications:
- RunConfig ConfigMap management (~49 lines) - existing shared package not utilized
- External auth configuration processing (~82 lines) - 99% identical between controllers
- OIDC configuration processing (~30 lines) - 90% identical
- Multiple utility functions (naming, resource parsing) - 100% identical
Recommended Actions:
- Use existing
pkg/runconfig/configmap
package for ConfigMap management - Extract common authentication logic to
common_helpers.go
- Consolidate identical utility functions
These refactorings would eliminate ~400-500 lines of duplicated code while improving maintainability. The duplications are documented in the inline comments below with specific suggestions for refactoring.
Impact
- Code reduction: ~400-500 lines
- Maintainability: Single source of truth for shared logic
- Bug risk: Reduces surface area for divergence
The functionality is well-implemented and thoroughly tested. These suggestions focus on code organization rather than correctness.
Implements the MCPRemoteProxy custom resource to enable Kubernetes-based proxying of remote MCP servers with authentication, authorization, audit logging, and tool filtering. This brings feature parity with the existing CLI 'thv run <remote-url>' capability to the Kubernetes operator. Organizations can now deploy centralized remote MCP proxies in Kubernetes to gain visibility, policy enforcement, and audit trails for employee interactions with external SaaS MCP services (e.g., Salesforce MCP, GitHub MCP). Key features: - Validates incoming OAuth tokens (JWT or introspection) - Extracts user identity from validated tokens for policy enforcement - Applies existing ToolHive middleware (authz, audit, tool filtering, telemetry) - Optional token exchange when IDP federation exists - Stateless, horizontally scalable architecture - User-configurable transport (SSE or streamable-http) Implementation includes: - MCPRemoteProxy CRD with comprehensive validation - Controller with refactored, maintainable code structure - Common helpers shared with MCPServer controller - Comprehensive unit tests (20+ test functions, 60+ test cases) - Generated CRD manifests and API documentation Related proposal: docs/proposals/THV-2151-remote-mcp-proxy.md 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Juan Antonio Osorio <ozz@stacklok.com>
Updates chainsaw E2E test assertions to include MCPRemoteProxy RBAC permissions, and bumps operator chart version to 0.2.24 for the new controller and RBAC changes. Changes: - Add mcpremoteproxies RBAC assertions to multi-tenancy test - Add mcpremoteproxies RBAC assertions to single-tenancy test - Bump operator chart version from 0.2.23 to 0.2.24 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Juan Antonio Osorio <ozz@stacklok.com>
MCPRemoteProxy controller does not use finalizers, so we shouldn't expect the RBAC permission. Only add RBAC permissions for what we actually use. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
…issions This commit addresses PR feedback from jhrozek and ChrisJBurns by eliminating code duplication between MCPServer and MCPRemoteProxy controllers, and implements security improvements by creating minimal RBAC rules for remote proxies. Changes: - Add shared AddExternalAuthConfigOptions and AddOIDCConfigOptions to common_helpers.go - Create remoteProxyRBACRules with minimal permissions (ConfigMaps and Secrets read-only) - Update MCPServer and MCPRemoteProxy controllers to use shared authentication functions - Remove ~240 lines of duplicated code across controllers - Update tests to use new shared functions Security improvements: - Remote proxies now use minimal RBAC permissions (no StatefulSet/Pod access) - Follows principle of least privilege for remote proxy service accounts The refactoring maintains all existing functionality while improving code maintainability and reducing the attack surface for remote proxy deployments. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
fe868ff
to 58b53eb
Compare Feedback AddressedI've addressed all the code duplication concerns raised by @jhrozek and @ChrisJBurns in commit 58b53eb. Changes Made1. Code Duplication Eliminated ✅
2. Security Improvements 🔒
3. Quality Checks ✅
Files Modified
The refactoring maintains all existing functionality while significantly improving code maintainability and reducing the attack surface for remote proxy deployments. @ChrisJBurns @jhrozek - Ready for re-review! 🚀 |
This commit addresses remaining code duplication identified in PR review and implements critical missing functionality for MCPRemoteProxy controller. Code Deduplication (~150 lines eliminated): - Move utility functions to common_helpers.go: * BuildResourceRequirements, BuildHealthProbe, EnsureRequiredEnvVars * MergeLabels, MergeAnnotations * CreateProxyServiceName, CreateProxyServiceURL, ProxyRunnerServiceAccountName - Update MCPServer and MCPRemoteProxy to use shared functions - Remove duplicated code from both controllers Critical Functionality Added: - Implement deployment update detection for MCPRemoteProxy (was missing!) - Implement service update detection - Deployments now update when spec changes (port, resources, env vars, etc.) Naming Improvements: - Add "remote-" prefix to MCPRemoteProxy resources to prevent conflicts: * Service: mcp-{name}-remote-proxy (vs mcp-{name}-proxy for MCPServer) * ServiceAccount: {name}-remote-proxy-runner - Prevents resource collisions when MCPServer and MCPRemoteProxy have same name Design Decision: - MCPRemoteProxy relies on owner references for cleanup (no finalizers) - Simpler than MCPServer approach, adequate for remote proxy use case All tests pass with updated expectations for new naming scheme. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Implements the MCPRemoteProxy custom resource to enable Kubernetes-based
proxying of remote MCP servers with authentication, authorization, audit
logging, and tool filtering. This brings feature parity with the existing
CLI 'thv run ' capability to the Kubernetes operator.
Organizations can now deploy centralized remote MCP proxies in Kubernetes
to gain visibility, policy enforcement, and audit trails for employee
interactions with external SaaS MCP services (e.g., Salesforce MCP, GitHub MCP).
Key features:
Implementation includes:
Related proposal: docs/proposals/THV-2151-remote-mcp-proxy.md
🤖 Generated with Claude Code