- Notifications
You must be signed in to change notification settings - Fork 158
Integrate health monitoring into vMCP server #3101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR integrates health monitoring infrastructure into the vMCP server to enable periodic backend health checks with configurable intervals and thresholds. The implementation provides a new HTTP endpoint for querying backend health status, graceful degradation when health monitoring fails, and authentication bypass for health check requests.
Key Changes:
- Added health monitor lifecycle management (initialization, startup, and shutdown) in the vMCP server
- Introduced
/api/backends/healthHTTP endpoint to expose backend health status - Updated authentication strategies to skip authentication for health check requests
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/vmcp/server/server.go | Integrates health monitor into server lifecycle with configuration, initialization, start/stop management, new HTTP endpoint handler, and getter methods for health status |
| pkg/vmcp/server/health_monitoring_test.go | Comprehensive test coverage for health monitoring scenarios including disabled/enabled states, startup failures, HTTP endpoint behavior, and lifecycle management |
| pkg/vmcp/auth/strategies/tokenexchange.go | Updates token exchange authentication strategy to skip authentication for health check requests using context marker |
| pkg/vmcp/auth/strategies/header_injection.go | Updates header injection authentication strategy to skip authentication for health check requests using context marker |
| cmd/vmcp/app/commands.go | Configures health monitor from operational settings, mapping HealthCheckInterval and UnhealthyThreshold to health.MonitorConfig |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
80eaf93 to 6a3fcae Compare Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@ ## main #3101 +/- ## ========================================== + Coverage 57.08% 57.12% +0.04% ========================================== Files 341 341 Lines 33940 34037 +97 ========================================== + Hits 19376 19445 +69 - Misses 12961 12982 +21 - Partials 1603 1610 +7 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
author taskbot <taskbot@users.noreply.github.com> 1766072123 +0100 committer taskbot <taskbot@users.noreply.github.com> 1766158585 +0100 Integrate health monitoring into vMCP server Integrates the health monitoring infrastructure (from previous into the vMCP server, enabling periodic backend health checks with configurable Related-to: #3036 intervals and thresholds. changes from review changes from review add missing method Apply suggestion from @Copilot Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
a4eedc5 to 95dba05 Compare There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if cfg.Operational != nil && cfg.Operational.FailureHandling != nil && cfg.Operational.FailureHandling.HealthCheckInterval > 0 { | ||
| defaults := health.DefaultConfig() | ||
| healthMonitorConfig = &health.MonitorConfig{ | ||
| CheckInterval: time.Duration(cfg.Operational.FailureHandling.HealthCheckInterval), | ||
| UnhealthyThreshold: cfg.Operational.FailureHandling.UnhealthyThreshold, | ||
| Timeout: defaults.Timeout, | ||
| DegradedThreshold: defaults.DegradedThreshold, | ||
| } |
Copilot AI Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The health check interval configuration is cast from a custom Duration type to time.Duration without validation. If HealthCheckInterval is zero or negative, it will pass through this check (line 338) but fail validation in health.NewMonitor, which expects CheckInterval > 0. Consider validating the HealthCheckInterval value here before passing it to the MonitorConfig to provide clearer error messages at configuration time.
| // Skip authentication for health checks | ||
| if health.IsHealthCheck(ctx) { | ||
| return nil | ||
| } |
Copilot AI Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The newly added health check bypass logic lacks test coverage. There should be a test case verifying that when health.IsHealthCheck(ctx) returns true, the Authenticate method returns nil without performing token exchange or requiring an identity in the context.
| // Skip authentication for health checks | ||
| if health.IsHealthCheck(ctx) { | ||
| return nil | ||
| } |
Copilot AI Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The newly added health check bypass logic lacks test coverage. There should be a test case verifying that when health.IsHealthCheck(ctx) returns true, the Authenticate method returns nil without injecting headers or validating the strategy configuration.
| // healthCheckContextKey is a marker for health check requests. | ||
| type healthCheckContextKey struct{} | ||
| | ||
| // WithHealthCheckMarker marks a context as a health check request. | ||
| // Authentication layers can use IsHealthCheck to identify and skip authentication | ||
| // for health check requests. | ||
| func WithHealthCheckMarker(ctx context.Context) context.Context { | ||
| return context.WithValue(ctx, healthCheckContextKey{}, true) | ||
| } | ||
| | ||
| // IsHealthCheck returns true if the context is marked as a health check. | ||
| // Authentication strategies use this to bypass authentication for health checks, | ||
| // since health checks verify backend availability and should not require user credentials. | ||
| func IsHealthCheck(ctx context.Context) bool { | ||
| val, ok := ctx.Value(healthCheckContextKey{}).(bool) | ||
| return ok && val | ||
| } |
Copilot AI Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The WithHealthCheckMarker and IsHealthCheck functions lack test coverage. These are critical security-related functions that control authentication bypass, and should have comprehensive tests verifying their behavior with valid contexts, nil values, and edge cases.
| | ||
| // healthMonitor performs periodic health checks on backends. | ||
| // Nil if health monitoring is disabled. | ||
| // Protected by healthMonitorMu for concurrent access from HTTP handlers. |
Copilot AI Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation states the health monitor is "Protected by healthMonitorMu for concurrent access from HTTP handlers", but the mutex usage pattern shows RLock for all reads and Lock only for writes. The comment should clarify that reads use RLock (shared access) while writes use Lock (exclusive access) to better reflect the actual concurrency model.
| // Protected by healthMonitorMu for concurrent access from HTTP handlers. | |
| // Concurrency: Protected by healthMonitorMu. | |
| // - Read-only access (e.g., from HTTP handlers) must use healthMonitorMu.RLock/RUnlock. | |
| // - Mutations (e.g., initialize, replace, or stop the monitor) must use healthMonitorMu.Lock/Unlock. |
Integrates the health monitoring infrastructure (from previous into the vMCP server, enabling periodic backend health checks with configurable intervals and thresholds.
Related-to: #3036