
@yrobla (Contributor) commented Dec 18, 2025

Integrates the health monitoring infrastructure (from the previous PR) into the vMCP server, enabling periodic backend health checks with configurable intervals and thresholds.

Related-to: #3036

@github-actions github-actions bot added the size/M Medium PR: 300-599 lines changed label Dec 18, 2025
@yrobla yrobla requested a review from Copilot December 18, 2025 15:37

Copilot AI left a comment

Pull request overview

This PR integrates health monitoring infrastructure into the vMCP server to enable periodic backend health checks with configurable intervals and thresholds. The implementation provides a new HTTP endpoint for querying backend health status, graceful degradation when health monitoring fails, and authentication bypass for health check requests.

Key Changes:

  • Added health monitor lifecycle management (initialization, startup, and shutdown) in the vMCP server
  • Introduced /api/backends/health HTTP endpoint to expose backend health status
  • Updated authentication strategies to skip authentication for health check requests
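To make the endpoint behavior concrete, here is a minimal sketch of what a handler for /api/backends/health could look like. The handler name, the JSON shape (backend name mapped to a status string), and the 503 response when monitoring is disabled are all assumptions for illustration; the PR's actual handler in pkg/vmcp/server/server.go may differ on every one of these points.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// backendHealthHandler is a hypothetical handler for GET /api/backends/health.
// A nil statuses map stands in for "health monitoring is disabled".
func backendHealthHandler(statuses map[string]string) http.HandlerFunc {
	return func(w http.ResponseWriter, _ *http.Request) {
		if statuses == nil {
			// Assumption: disabled monitoring is reported as 503.
			http.Error(w, "health monitoring disabled", http.StatusServiceUnavailable)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(statuses)
	}
}

// queryHealth sends one in-memory request to the handler and returns the
// status code and response body.
func queryHealth(statuses map[string]string) (int, string) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/api/backends/health", nil)
	backendHealthHandler(statuses)(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := queryHealth(map[string]string{"backend-a": "healthy"})
	fmt.Println(code, body)

	code, _ = queryHealth(nil)
	fmt.Println(code)
}
```

Passing the status map into the handler keeps the sketch independent of the monitor type; a real handler would read the current status from the monitor on each request.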

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| pkg/vmcp/server/server.go | Integrates health monitor into server lifecycle with configuration, initialization, start/stop management, new HTTP endpoint handler, and getter methods for health status |
| pkg/vmcp/server/health_monitoring_test.go | Comprehensive test coverage for health monitoring scenarios including disabled/enabled states, startup failures, HTTP endpoint behavior, and lifecycle management |
| pkg/vmcp/auth/strategies/tokenexchange.go | Updates token exchange authentication strategy to skip authentication for health check requests using context marker |
| pkg/vmcp/auth/strategies/header_injection.go | Updates header injection authentication strategy to skip authentication for health check requests using context marker |
| cmd/vmcp/app/commands.go | Configures health monitor from operational settings, mapping HealthCheckInterval and UnhealthyThreshold to health.MonitorConfig |

@github-actions github-actions bot added size/L Large PR: 600-999 lines changed and removed size/M Medium PR: 300-599 lines changed labels Dec 18, 2025
@yrobla yrobla force-pushed the feat/issue-3036-healthcheck-2 branch from 80eaf93 to 6a3fcae Compare December 19, 2025 08:32
codecov bot commented Dec 19, 2025

Codecov Report

❌ Patch coverage is 64.22018% with 39 lines in your changes missing coverage. Please review.
✅ Project coverage is 57.12%. Comparing base (fb475a8) to head (95dba05).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| cmd/vmcp/app/commands.go | 0.00% | 18 Missing ⚠️ |
| pkg/vmcp/server/server.go | 82.50% | 9 Missing and 5 partials ⚠️ |
| pkg/vmcp/health/monitor.go | 57.14% | 3 Missing ⚠️ |
| pkg/vmcp/auth/strategies/header_injection.go | 0.00% | 1 Missing and 1 partial ⚠️ |
| pkg/vmcp/auth/strategies/tokenexchange.go | 0.00% | 1 Missing and 1 partial ⚠️ |
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3101      +/-   ##
==========================================
+ Coverage   57.08%   57.12%   +0.04%
==========================================
  Files         341      341
  Lines       33940    34037      +97
==========================================
+ Hits        19376    19445      +69
- Misses      12961    12982      +21
- Partials     1603     1610       +7

Base automatically changed from feat/issue-3036-healthcheck-1 to main December 19, 2025 14:49
@yrobla yrobla requested a review from Copilot December 19, 2025 15:04

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

author taskbot <taskbot@users.noreply.github.com> 1766072123 +0100
committer taskbot <taskbot@users.noreply.github.com> 1766158585 +0100

Integrate health monitoring into vMCP server

Integrates the health monitoring infrastructure (from the previous PR) into the vMCP server, enabling periodic backend health checks with configurable intervals and thresholds.

Related-to: #3036

changes from review
changes from review
add missing method
Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@yrobla yrobla force-pushed the feat/issue-3036-healthcheck-2 branch from a4eedc5 to 95dba05 Compare December 19, 2025 15:37
@yrobla yrobla requested a review from Copilot December 19, 2025 15:38

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.

Comment on lines +338 to +345
if cfg.Operational != nil && cfg.Operational.FailureHandling != nil && cfg.Operational.FailureHandling.HealthCheckInterval > 0 {
	defaults := health.DefaultConfig()
	healthMonitorConfig = &health.MonitorConfig{
		CheckInterval:      time.Duration(cfg.Operational.FailureHandling.HealthCheckInterval),
		UnhealthyThreshold: cfg.Operational.FailureHandling.UnhealthyThreshold,
		Timeout:            defaults.Timeout,
		DegradedThreshold:  defaults.DegradedThreshold,
	}

Copilot AI Dec 19, 2025

The health check interval configuration is cast from a custom Duration type to time.Duration without validation. If HealthCheckInterval is zero or negative, it will pass through this check (line 338) but fail validation in health.NewMonitor, which expects CheckInterval > 0. Consider validating the HealthCheckInterval value here before passing it to the MonitorConfig to provide clearer error messages at configuration time.
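One way to surface bad values at configuration time is a small constructor that checks the settings before building the config. The sketch below is illustrative only: MonitorConfig here is a reduced, hypothetical stand-in for health.MonitorConfig, newMonitorConfig is an invented helper, and the assumption that UnhealthyThreshold is an int that must be at least 1 is not confirmed by the PR.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// MonitorConfig is a hypothetical stand-in for health.MonitorConfig, reduced
// to the two fields that are mapped from the operational settings.
type MonitorConfig struct {
	CheckInterval      time.Duration
	UnhealthyThreshold int
}

// newMonitorConfig validates the raw settings before building the config, so
// an invalid value is reported with a specific message at configuration time.
func newMonitorConfig(interval time.Duration, unhealthyThreshold int) (*MonitorConfig, error) {
	if interval <= 0 {
		return nil, errors.New("health check interval must be positive")
	}
	if unhealthyThreshold < 1 {
		return nil, fmt.Errorf("unhealthy threshold must be at least 1, got %d", unhealthyThreshold)
	}
	return &MonitorConfig{
		CheckInterval:      interval,
		UnhealthyThreshold: unhealthyThreshold,
	}, nil
}

func main() {
	if _, err := newMonitorConfig(30*time.Second, 0); err != nil {
		fmt.Println("invalid config:", err)
	}
	if cfg, err := newMonitorConfig(30*time.Second, 3); err == nil {
		fmt.Println("interval:", cfg.CheckInterval)
	}
}
```

Returning an error instead of silently skipping the monitor makes a misconfigured interval or threshold visible to the operator rather than leaving monitoring disabled without explanation.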

Comment on lines +97 to +100
// Skip authentication for health checks
if health.IsHealthCheck(ctx) {
	return nil
}

Copilot AI Dec 19, 2025

The newly added health check bypass logic lacks test coverage. There should be a test case verifying that when health.IsHealthCheck(ctx) returns true, the Authenticate method returns nil without performing token exchange or requiring an identity in the context.
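A test along these lines could be sketched as follows. The marker functions are copied from the health package snippet in this PR; fakeStrategy and authenticateOnce are hypothetical stand-ins, since the real strategy's Authenticate signature and dependencies are not shown in this conversation. The fake counts how often the credential path runs, which lets the check prove that the bypass returned before doing any work.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

type healthCheckContextKey struct{}

// WithHealthCheckMarker marks a context as a health check request.
func WithHealthCheckMarker(ctx context.Context) context.Context {
	return context.WithValue(ctx, healthCheckContextKey{}, true)
}

// IsHealthCheck reports whether the context is marked as a health check.
func IsHealthCheck(ctx context.Context) bool {
	val, ok := ctx.Value(healthCheckContextKey{}).(bool)
	return ok && val
}

// fakeStrategy is a hypothetical stand-in for an auth strategy. It counts
// how often the credential path runs.
type fakeStrategy struct{ credentialCalls int }

// Authenticate mirrors the bypass from the PR, then simulates a request that
// has no identity in its context.
func (s *fakeStrategy) Authenticate(ctx context.Context) error {
	if IsHealthCheck(ctx) {
		return nil
	}
	s.credentialCalls++
	return errors.New("no identity found in context")
}

// authenticateOnce runs Authenticate with or without the marker and returns
// the number of credential calls made along with the resulting error.
func authenticateOnce(healthCheck bool) (int, error) {
	ctx := context.Background()
	if healthCheck {
		ctx = WithHealthCheckMarker(ctx)
	}
	s := &fakeStrategy{}
	err := s.Authenticate(ctx)
	return s.credentialCalls, err
}

func main() {
	calls, err := authenticateOnce(true)
	fmt.Println("health check:", calls, err)

	calls, err = authenticateOnce(false)
	fmt.Println("regular request:", calls, err)
}
```

The same shape should carry over to the header injection strategy, with the counter replaced by an assertion that no headers were set on the request.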

Comment on lines +65 to +68
// Skip authentication for health checks
if health.IsHealthCheck(ctx) {
	return nil
}

Copilot AI Dec 19, 2025

The newly added health check bypass logic lacks test coverage. There should be a test case verifying that when health.IsHealthCheck(ctx) returns true, the Authenticate method returns nil without injecting headers or validating the strategy configuration.

Comment on lines +13 to +29
// healthCheckContextKey is a marker for health check requests.
type healthCheckContextKey struct{}

// WithHealthCheckMarker marks a context as a health check request.
// Authentication layers can use IsHealthCheck to identify and skip authentication
// for health check requests.
func WithHealthCheckMarker(ctx context.Context) context.Context {
	return context.WithValue(ctx, healthCheckContextKey{}, true)
}

// IsHealthCheck returns true if the context is marked as a health check.
// Authentication strategies use this to bypass authentication for health checks,
// since health checks verify backend availability and should not require user credentials.
func IsHealthCheck(ctx context.Context) bool {
	val, ok := ctx.Value(healthCheckContextKey{}).(bool)
	return ok && val
}

Copilot AI Dec 19, 2025

The WithHealthCheckMarker and IsHealthCheck functions lack test coverage. These are critical security-related functions that control authentication bypass, and should have comprehensive tests verifying their behavior with valid contexts, nil values, and edge cases.
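The edge cases worth covering can be sketched as a table of contexts, using the two functions exactly as they appear in the snippet above. The markerCases helper and its case names are hypothetical. Because the key type is unexported, the last two cases (a false marker and a non-bool marker) can only be built from inside the health package, which is also why code outside the package cannot forge or clear the marker.

```go
package main

import (
	"context"
	"fmt"
)

type healthCheckContextKey struct{}

// WithHealthCheckMarker marks a context as a health check request.
func WithHealthCheckMarker(ctx context.Context) context.Context {
	return context.WithValue(ctx, healthCheckContextKey{}, true)
}

// IsHealthCheck reports whether the context is marked as a health check.
func IsHealthCheck(ctx context.Context) bool {
	val, ok := ctx.Value(healthCheckContextKey{}).(bool)
	return ok && val
}

// markerCases evaluates IsHealthCheck against a set of contexts and returns
// the result for each named case.
func markerCases() map[string]bool {
	type otherKey struct{}

	base := context.Background()
	marked := WithHealthCheckMarker(base)
	child, cancel := context.WithCancel(marked)
	defer cancel()

	return map[string]bool{
		"unmarked":              IsHealthCheck(base),
		"marked":                IsHealthCheck(marked),
		"derived from marked":   IsHealthCheck(child),
		"unrelated key":         IsHealthCheck(context.WithValue(base, otherKey{}, true)),
		"marker set to false":   IsHealthCheck(context.WithValue(base, healthCheckContextKey{}, false)),
		"non-bool marker value": IsHealthCheck(context.WithValue(base, healthCheckContextKey{}, "yes")),
	}
}

func main() {
	for name, got := range markerCases() {
		fmt.Printf("%-22s %v\n", name, got)
	}
}
```

Only the marked context and contexts derived from it should report true. A nil context is deliberately not included: calling Value on a nil context.Context panics, so that case would need an explicit guard in IsHealthCheck before it could be tested.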


// healthMonitor performs periodic health checks on backends.
// Nil if health monitoring is disabled.
// Protected by healthMonitorMu for concurrent access from HTTP handlers.

Copilot AI Dec 19, 2025

The documentation states the health monitor is "Protected by healthMonitorMu for concurrent access from HTTP handlers", but the mutex usage pattern shows RLock for all reads and Lock only for writes. The comment should clarify that reads use RLock (shared access) while writes use Lock (exclusive access) to better reflect the actual concurrency model.

Suggested change
- // Protected by healthMonitorMu for concurrent access from HTTP handlers.
+ // Concurrency: Protected by healthMonitorMu.
+ // - Read-only access (e.g., from HTTP handlers) must use healthMonitorMu.RLock/RUnlock.
+ // - Mutations (e.g., initialize, replace, or stop the monitor) must use healthMonitorMu.Lock/Unlock.
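The locking discipline described in this suggestion can be sketched with sync.RWMutex. Only the healthMonitor and healthMonitorMu field names come from the PR; the monitor type, its healthy field, and the healthyCount and setMonitor methods are hypothetical stand-ins.

```go
package main

import (
	"fmt"
	"sync"
)

// monitor is a hypothetical stand-in for the health monitor.
type monitor struct{ healthy int }

// server holds the monitor behind a read-write mutex. A nil healthMonitor
// means health monitoring is disabled.
type server struct {
	healthMonitorMu sync.RWMutex
	healthMonitor   *monitor
}

// healthyCount is a read path, as an HTTP handler would use it. It takes the
// shared lock so concurrent readers do not block each other.
func (s *server) healthyCount() (int, bool) {
	s.healthMonitorMu.RLock()
	defer s.healthMonitorMu.RUnlock()
	if s.healthMonitor == nil {
		return 0, false
	}
	return s.healthMonitor.healthy, true
}

// setMonitor is a write path (initialize, replace, or clear the monitor). It
// takes the exclusive lock.
func (s *server) setMonitor(m *monitor) {
	s.healthMonitorMu.Lock()
	defer s.healthMonitorMu.Unlock()
	s.healthMonitor = m
}

func main() {
	s := &server{}
	s.setMonitor(&monitor{healthy: 3})

	// Several readers may hold the shared lock at the same time.
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.healthyCount()
		}()
	}
	wg.Wait()

	n, ok := s.healthyCount()
	fmt.Println(n, ok)
}
```

Checking for nil while holding the read lock matters here: reading the pointer before locking would race with a concurrent setMonitor(nil) during shutdown.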

Labels

size/L Large PR: 600-999 lines changed

3 participants