Skip to content

Conversation

amirejaz
Copy link
Contributor

@amirejaz amirejaz commented Oct 16, 2025

Problem

Existing workloads with plain text OAuth client secrets stored in their RunConfig would fail to restart after new ToolHive UI releases. This occurred because:

  1. JSON Format Mismatch: Stored RunConfigs used PascalCase field names (ClientID, ClientSecret) but new code expected snake_case (client_id, client_secret)
  2. Secret Exposure: Plain text secrets were being stored directly in RunConfig and exposed via API
  3. Migration Gap: No mechanism existed to convert old plain text secrets to the new CLI format

Solution

🔧 Backward Compatibility

  • Custom JSON Unmarshaling: Added UnmarshalJSON() method to RemoteAuthConfig that handles both old PascalCase and new snake_case formats
  • Smart Format Detection: Automatically detects JSON format and unmarshals accordingly

🔄 Migration Hook

  • Automatic Migration: Added migrateOAuthClientSecret() function in ReadJSON() that automatically converts plain text OAuth secrets to CLI format
  • One-time Migration: Migration only runs once per workload, then saves the updated RunConfig
  • Graceful Fallback: If migration fails, logs warning but doesn't fail the workload load

🛠️ Shared Utility Package

  • Centralized Logic: Created pkg/oauth/client_secret.go with shared OAuth secret processing functions
  • Consistent Processing: Both CLI and migration use the same logic for secret name generation and storage
  • Simplified API: Combined name generation and uniqueness checking into single GenerateUniqueSecretName() function

🔒 Security Improvements

  • No Plain Text Storage: OAuth client secrets are never stored as plain text in RunConfig
  • CLI Format References: All secrets stored as "SECRET_NAME,target=oauth_secret" format
  • Runtime Resolution: Secrets are resolved during workload execution via WithSecrets()

Key Changes

Files Modified

  • pkg/runner/config.go: Added backward compatibility and migration hook
  • pkg/oauth/client_secret.go: New shared utility package
  • cmd/thv/app/run_flags.go: Updated to use shared utility
  • pkg/api/v1/workloads.go: Export API now returns object format secrets
  • pkg/api/v1/workload_types.go: Updated conversion functions

New Functions

  • RemoteAuthConfig.UnmarshalJSON(): Handles both JSON formats
  • migrateOAuthClientSecret(): Converts plain text to CLI format
  • oauth.ProcessOAuthClientSecret(): Shared secret processing
  • oauth.GenerateUniqueSecretName(): Unified name generation

Testing

  • Backward Compatibility: Old PascalCase format loads correctly
  • New Format: New snake_case format works as expected
  • Mixed Format: Handles workloads with both old and new fields
  • Migration Logic: Plain text secrets converted to CLI format
  • API Consistency: Export and details APIs return object format
  • Runtime Resolution: CLI format secrets resolved during execution

Impact

Before: Existing workloads with plain text OAuth secrets would fail to restart
After: Seamless migration and backward compatibility for all existing workloads

This ensures that existing ToolHive deployments can continue running without interruption while benefiting from the new secure OAuth secret handling.

Copy link

codecov bot commented Oct 16, 2025

Codecov Report

❌ Patch coverage is 30.55556% with 100 lines in your changes missing coverage. Please review.
✅ Project coverage is 53.34%. Comparing base (3453db0) to head (c74611a).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
pkg/runner/config.go 5.79% 64 Missing and 1 partial ⚠️
pkg/oauth/client_secret.go 53.33% 33 Missing and 2 partials ⚠️
Additional details and impacted files
@@ Coverage Diff @@ ## main #2231 +/- ## ========================================== - Coverage 53.43% 53.34% -0.10%  ========================================== Files 230 231 +1 Lines 29385 29529 +144 ========================================== + Hits 15703 15753 +50  - Misses 12552 12644 +92  - Partials 1130 1132 +2 

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
@ChrisJBurns
Copy link
Collaborator

@claude Can you review this PR please?

Copy link
Contributor

claude bot commented Oct 16, 2025

Claude finished @ChrisJBurns's task —— View job


Code Review Complete

I've reviewed the OAuth client secret migration and backward compatibility implementation. Here's my comprehensive feedback:


🎯 Overall Assessment

Strengths:

  • ✅ Solid backward compatibility strategy with custom JSON unmarshaling
  • ✅ Good code organization by extracting shared logic into pkg/oauth/client_secret.go
  • ✅ Sensible migration approach that runs once during ReadJSON()
  • ✅ Appropriate error handling with warnings instead of failures for migration errors

Concerns:

  • ⚠️ Critical: Very low test coverage (4.41% patch coverage)
  • ⚠️ Potential circular dependency risk between packages
  • ⚠️ Missing edge case handling in migration logic
  • ⚠️ No integration tests for the migration path

🔍 Detailed Findings

1. Test Coverage (Critical Priority)

The codecov report shows 130 lines missing coverage across the changes:

  • pkg/oauth/client_secret.go: 65 missing lines (97% untested)
  • pkg/runner/config.go: 64 missing lines + 1 partial (migration logic untested)

Recommendations:

  • Add unit tests for pkg/oauth/client_secret.go covering:

    • ProcessOAuthClientSecret() with plain text input
    • ProcessOAuthClientSecret() with already-formatted CLI input
    • GenerateUniqueSecretName() with and without conflicts
    • StoreSecretInManager() with read-only providers
    • Error cases (invalid secrets manager, storage failures)
  • Add tests for pkg/runner/config.go covering:

    • migrateOAuthClientSecret() with plain text secrets
    • migrateOAuthClientSecret() with already-migrated secrets
    • migrateOAuthClientSecret() with empty/nil configs
    • RemoteAuthConfig.UnmarshalJSON() with old PascalCase format
    • RemoteAuthConfig.UnmarshalJSON() with new snake_case format
    • RemoteAuthConfig.UnmarshalJSON() with mixed/malformed input

File: pkg/runner/config.go:208-244, pkg/oauth/client_secret.go:14-114


2. Migration Logic - Error Handling

Issue: Migration failures are logged as warnings but the workload continues loading with potentially unresolved state.

Location: pkg/runner/config.go:208-211

if err := migrateOAuthClientSecret(&config); err != nil { logger.Warnf("Failed to migrate OAuth client secret for workload %s: %v", config.Name, err) // Don't fail the load - just log the warning and continue }

Concern: If migration fails, the workload loads with plain text secret still in config, but the secret might not be stored in the secrets manager. This could lead to:

  1. Plain text secrets remaining in storage (security issue)
  2. Workload functioning initially but failing on next restart
  3. Silent failures that are hard to diagnose

Recommendation:
Consider different failure modes:

  • If secret storage fails due to read-only provider → fail migration but allow load (current behavior OK)
  • If secret storage succeeds but SaveState() fails → log warning but continue (current behavior OK)
  • If secret name generation fails → this suggests a deeper issue, consider failing the load

Add comments explaining each failure scenario and why it's handled that way.


3. Backward Compatibility - UnmarshalJSON

Issue: The detection logic assumes PascalCase format if ClientID field exists.

Location: pkg/runner/config.go:530

if _, isOld := raw["ClientID"]; isOld {

Concern: What happens if someone has a JSON file with both ClientID and client_id fields? The current logic would use PascalCase and ignore snake_case.

Recommendation:

  • Add a comment documenting this edge case
  • Consider logging a warning if both formats are detected
  • Add a test case for this scenario

4. Secret Name Generation

Issue: Timestamp-based uniqueness might not be sufficient for high-frequency operations.

Location: pkg/oauth/client_secret.go:64-65

timestamp := time.Now().Unix() uniqueName := fmt.Sprintf("%s_%d", baseName, timestamp)

Concern: If two migrations happen within the same second (e.g., during bulk import or testing), they could generate the same secret name, leading to overwrites.

Recommendation:

  • Add a counter or use nanosecond precision: time.Now().UnixNano()
  • Or generate a short random suffix for additional uniqueness
  • Consider using a format like: OAUTH_CLIENT_SECRET_{workloadName}_{timestamp}_{random}

5. Package Dependencies

Observation: The new pkg/oauth package depends on pkg/config and pkg/secrets, which creates a dependency chain.

Location: pkg/oauth/client_secret.go:9-10

Concern: If pkg/secrets later needs OAuth functionality, we could create a circular dependency. This is more of a future-proofing concern.

Recommendation:

  • Document the dependency structure
  • Consider if the secrets manager retrieval could be injected as a dependency instead of directly created in the oauth package
  • Monitor for circular dependency warnings during builds

6. Code Quality - Positive Notes

Well done:

  • ✅ Good separation of concerns with ProcessOAuthClientSecret() as a reusable function
  • ✅ Appropriate use of context in StoreSecretInManager()
  • ✅ Clear error messages with wrapped errors using %w
  • ✅ Consistent naming conventions
  • ✅ Good documentation in PR description

7. Migration Idempotency

Positive: The migration correctly checks if a secret is already in CLI format before processing:

Location: pkg/runner/config.go:224-226

if _, err := secrets.ParseSecretParameter(config.RemoteAuthConfig.ClientSecret); err == nil { return nil // Already in CLI format, no migration needed }

This ensures the migration is idempotent and won't double-process secrets. Good design! ✅


8. Missing Test Cases

Based on the code, here are critical test scenarios that should be covered:

  1. Migration path tests:

    • Workload with old PascalCase format → should load and migrate
    • Workload already migrated → should load without re-migrating
    • Workload with read-only secrets provider → should fail gracefully
    • Multiple workloads with same name → should generate unique secret names
  2. Backward compatibility tests:

    • Load config with old PascalCase JSON
    • Load config with new snake_case JSON
    • Load config with mixed format (both fields present)
  3. Error handling tests:

    • Secrets manager not initialized
    • Secrets manager write fails
    • SaveState fails after migration

📋 Recommendations Summary

High Priority (Must Address):

  1. Add comprehensive unit tests to bring coverage above 80% for changed files
  2. Document migration failure scenarios in code comments
  3. Fix timestamp uniqueness issue for secret name generation
  4. Add integration test for the full migration path

Medium Priority (Should Address):

  1. Add test for mixed JSON format (both PascalCase and snake_case)
  2. Add warning log when both JSON formats are detected
  3. Consider dependency injection for secrets manager in oauth package

Low Priority (Nice to Have):

  1. Add metrics/telemetry for tracking successful migrations
  2. Consider a migration dry-run mode for testing
  3. Document the migration process in user-facing docs

🏁 Conclusion

This is a well-architected solution to a real backward compatibility problem. The approach is sound, and the code organization is clean. However, the lack of test coverage is a significant blocker that should be addressed before merging.

The migration logic itself is solid, but needs comprehensive testing to ensure it handles all edge cases correctly, especially given that this code will be handling sensitive OAuth credentials.

Recommendation: Request changes to add comprehensive test coverage before approving.


Job Run | Branch: migrate_oauth_config

@JAORMX
Copy link
Collaborator

JAORMX commented Oct 17, 2025

@claude please review this with a security point of view.

Copy link
Contributor

claude bot commented Oct 17, 2025

Claude finished @JAORMX's task —— View job


Security Review

Performing security-focused code review of OAuth client secret migration implementation.

Tasks

  • Read and analyze changed files
  • Review secret handling and storage mechanisms
  • Check for secret exposure vulnerabilities
  • Analyze migration logic for security risks
  • Review error handling and logging
  • Assess API endpoint security
  • Check for authentication/authorization issues
  • Verify test coverage for security-critical paths
  • Post comprehensive security findings

Security Analysis Complete

I've conducted a thorough security review of the OAuth client secret migration implementation. Here are my findings:


🔒 Critical Security Findings

1. Secret Exposure in Warning Logs - HIGH SEVERITY ⚠️

Location: pkg/runner/config.go:238

logger.Warnf("Failed to save migrated RunConfig for workload %s: %v", config.Name, err)

Issue: While this specific log doesn't expose secrets directly, the migration error path could potentially leak sensitive information if errors from the secrets manager contain secret values or metadata.

Recommendation:

  • Ensure error messages from SaveState() don't contain secret values
  • Consider sanitizing error messages before logging
  • Add explicit tests to verify no secrets are logged

File: pkg/runner/config.go:215-243


2. Migration Failure Leaves System in Inconsistent State - MEDIUM SEVERITY

Location: pkg/runner/config.go:208-210

if err := migrateOAuthClientSecret(&config); err != nil { return nil, fmt.Errorf("failed to migrate OAuth client secret: %w", err) }

Issue: The function returns an error on migration failure, which prevents workload loading. However, there's a subtle security risk:

  1. Secret is generated and stored in secrets manager (ProcessOAuthClientSecret)
  2. If SaveState() fails afterward (line 237), the secret exists in the manager but the RunConfig still references plain text
  3. On next restart, migration runs again, creating another secret in the manager
  4. This creates orphaned secrets that accumulate over time

Security Impact:

  • Secret sprawl: Multiple copies of the same credential in secrets manager
  • Cleanup difficulty: Orphaned secrets make auditing harder
  • Compliance risk: Old secrets may not be rotated/deleted properly

Recommendation:

// Wrap migration in a transaction-like pattern secretName, err := oauth.ProcessOAuthClientSecret(config.Name, config.RemoteAuthConfig.ClientSecret) if err != nil { return fmt.Errorf("failed to process OAuth client secret: %w", err) } // Update config with secret reference config.RemoteAuthConfig.ClientSecret = secretName // Attempt to save - if this fails, we should clean up the secret if err := config.SaveState(context.Background()); err != nil { // Best-effort cleanup of the secret we just created _ = deleteSecret(context.Background(), secretName) // Add this function return fmt.Errorf("failed to save migrated RunConfig: %w", err) }

File: pkg/runner/config.go:215-243


3. Race Condition in Secret Name Generation - MEDIUM SEVERITY

Location: pkg/oauth/client_secret.go:66-67

timestamp := time.Now().Unix() uniqueName := fmt.Sprintf("%s_%d", baseName, timestamp)

Issue: Using Unix second-precision timestamps for uniqueness creates a race condition window:

  1. Two workloads with the same name created within 1 second will generate identical secret names
  2. The second call will overwrite the first secret in the secrets manager
  3. This could cause authentication failures for the first workload

Attack Scenario:

  • Malicious user could deliberately create workloads rapidly to cause secret overwrites
  • Race condition during bulk imports or automated provisioning

Security Impact:

  • Secret overwrite: First workload's secret gets replaced
  • Authentication failure: First workload can no longer authenticate
  • Denial of service: Legitimate workloads fail to start

Recommendation:

// Use nanosecond precision + random suffix for collision resistance timestamp := time.Now().UnixNano() randomSuffix := make([]byte, 4) rand.Read(randomSuffix) uniqueName := fmt.Sprintf("%s_%d_%x", baseName, timestamp, randomSuffix)

Or better yet:

// Check in a loop until we find an available name for i := 0; i < 10; i++ { timestamp := time.Now().UnixNano() uniqueName := fmt.Sprintf("%s_%d", baseName, timestamp) _, err := secretManager.GetSecret(ctx, uniqueName) if err != nil { // Secret doesn't exist, we can use this name return uniqueName, nil } time.Sleep(1 * time.Millisecond) // Small delay before retry } return "", fmt.Errorf("failed to generate unique secret name after 10 attempts")

File: pkg/oauth/client_secret.go:44-69


4. Plain Text Secret Stored in Memory During Migration - LOW SEVERITY

Location: pkg/runner/config.go:228-234

cliFormatSecret, err := oauth.ProcessOAuthClientSecret(config.Name, config.RemoteAuthConfig.ClientSecret) // ...  config.RemoteAuthConfig.ClientSecret = cliFormatSecret

Issue: During migration, the plain text secret remains in the config struct until it's replaced. If the Go garbage collector doesn't clean up the old string immediately, it could remain in memory.

Security Impact:

  • Memory exposure: Plain text secret visible in process memory
  • Core dumps: Secret could be captured in crash dumps
  • Memory scanning: Advanced attacks could read process memory

Recommendation:

// Explicitly clear the old secret from memory oldSecret := config.RemoteAuthConfig.ClientSecret config.RemoteAuthConfig.ClientSecret = cliFormatSecret // Zero out the old secret string (best effort in Go) if oldSecret != cliFormatSecret { // Note: This doesn't actually work in Go due to string immutability // But documents the intent for future refactoring runtime.KeepAlive(oldSecret) // Prevent premature GC }

Note: This is a known limitation of Go's string immutability. For truly sensitive secrets, consider using []byte with explicit zeroing instead of string.

File: pkg/runner/config.go:215-243


🛡️ Positive Security Findings

Secret Format Validation

The code properly validates secret format before migration:

if _, err := secrets.ParseSecretParameter(config.RemoteAuthConfig.ClientSecret); err == nil { return nil // Already in CLI format }

This prevents double-migration and ensures idempotency. Good design!


Read-Only Provider Protection

The code checks if the secrets provider supports writes before attempting migration:

if !secretManager.Capabilities().CanWrite { return fmt.Errorf("secrets provider %s does not support writing secrets", providerType) }

This prevents failures on read-only providers like 1Password. Excellent defensive programming!


Secrets Resolution at Runtime

Secrets are resolved at runtime via WithSecrets() rather than stored in config:

actualSecret, err := secretManager.GetSecret(ctx, secretParam.Name) config.RemoteAuthConfig.ClientSecret = actualSecret

This ensures secrets are fetched fresh and not persisted. Correct approach!


Backward Compatibility Handling

Custom UnmarshalJSON handles both old PascalCase and new snake_case formats gracefully without exposing secrets. Well implemented!


🧪 Test Coverage Concerns

Missing Security-Critical Tests - HIGH PRIORITY

The codecov report shows 9.62% patch coverage with 122 lines untested. Critical security paths lack coverage:

Untested Security Scenarios:

  1. Secret exposure in error messages - No test verifies secrets aren't logged
  2. Migration failure with SaveState error - Orphaned secret scenario untested
  3. Concurrent secret generation - Race condition untested
  4. Read-only provider migration - Error path untested
  5. Mixed JSON format handling - Both PascalCase + snake_case together untested

Recommendation:
Add comprehensive security tests:

func TestMigrateOAuthClientSecret_NoSecretExposureInLogs(t *testing.T) func TestMigrateOAuthClientSecret_SaveStateFailure_OrphanedSecret(t *testing.T) func TestGenerateUniqueSecretName_ConcurrentCalls(t *testing.T) func TestMigrateOAuthClientSecret_ReadOnlyProvider(t *testing.T) func TestRemoteAuthConfig_UnmarshalJSON_MixedFormat(t *testing.T)

Files needing tests:

  • pkg/oauth/client_secret.go: Lines 24-33, 46-69, 72-93
  • pkg/runner/config.go: Lines 208-243, 380-393

📋 Additional Security Recommendations

5. Add Audit Logging for Secret Migration

Recommendation: Add audit events when secrets are migrated to track security-relevant operations:

if err := migrateOAuthClientSecret(&config); err != nil { audit.Log(ctx, audit.SecretMigrationFailed, config.Name, err) return nil, fmt.Errorf("failed to migrate OAuth client secret: %w", err) } audit.Log(ctx, audit.SecretMigrationSuccess, config.Name, nil)

6. Document Security Assumptions

Recommendation: Add security documentation to the code:

// SECURITY NOTE: This migration converts plain text OAuth secrets to CLI format references. // The actual secret value is stored in the configured secrets manager (1Password, encrypted storage, etc.) // and is resolved at runtime. This prevents secrets from being persisted in RunConfig JSON files. // // Security properties: // - Secrets are never logged (verified in tests) // - Migration is idempotent (safe to run multiple times) // - Failure to save migrated config leaves secret in manager (requires manual cleanup) // - Read-only providers will fail with clear error message

7. Consider Secret Versioning

For future enhancements, consider adding secret versioning to track migrations:

type RemoteAuthConfig struct { ClientSecret string `json:"client_secret,omitempty"` SecretVersion int `json:"secret_version,omitempty"` // 1 = plain text, 2 = CLI format }

This would make debugging and auditing easier.


🎯 Summary

Overall Security Posture: GOOD with notable concerns

The migration implementation follows secure practices for the most part:

  • ✅ Secrets stored securely in secrets manager
  • ✅ Runtime resolution prevents persistence
  • ✅ Read-only provider protection
  • ✅ Idempotent migration logic

Critical Issues to Address:

  1. HIGH: Race condition in secret name generation (could cause secret overwrites)
  2. MEDIUM: Orphaned secrets from SaveState failures (secret sprawl)
  3. HIGH: Insufficient test coverage for security-critical paths (9.62%)

Recommendation: Request changes to address the race condition and add comprehensive security tests before merging.


Risk Assessment

Finding Severity Likelihood Impact Priority
Race condition in secret generation Medium Medium High HIGH
Orphaned secrets on SaveState failure Medium Low Medium MEDIUM
Missing security tests High N/A High HIGH
Secret exposure in logs Low Low High MEDIUM
Plain text in memory Low Very Low Medium LOW

Job Run | Branch: migrate_oauth_config

@amirejaz amirejaz merged commit 1c3e1bc into main Oct 17, 2025
30 checks passed
@amirejaz amirejaz deleted the migrate_oauth_config branch October 17, 2025 14:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

3 participants