
Conversation

breardon2011 (Contributor)

No description provided.

@greptile-apps greptile-apps bot left a comment

Greptile Summary

This PR introduces a comprehensive context-aware logging system that replaces the basic slog implementation throughout the Digger backend. The changes implement structured JSON logging with request ID tracing, goroutine-specific logger management, and intelligent argument parsing.

The core addition is the new backend/logging/context_logger.go file which provides a sophisticated logging infrastructure that can associate logs with specific HTTP requests through context propagation. The system uses goroutine ID extraction to maintain logger state across asynchronous operations, enabling request correlation even when operations span multiple goroutines.

The implementation includes a new HTTP middleware (backend/logging/middleware.go) that generates X-Request-ID headers and creates request-scoped loggers with comprehensive metadata (method, path, IP, latency). This middleware replaces the previous sloggin-based approach in the bootstrap configuration.
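
A minimal sketch of what such middleware can look like (the actual middleware.go is not quoted in this thread, so the helper names and app-specific details below are assumptions):

    package logging

    import (
        "context"
        "log/slog"
        "time"

        "github.com/gin-gonic/gin"
        "github.com/google/uuid"
    )

    type requestLoggerKey struct{}

    // RequestLogger tags each request with an X-Request-ID and stores a
    // request-scoped logger in the request context.
    func RequestLogger(base *slog.Logger) gin.HandlerFunc {
        return func(c *gin.Context) {
            reqID := c.GetHeader("X-Request-ID")
            if reqID == "" {
                reqID = uuid.NewString()
            }
            c.Header("X-Request-ID", reqID)

            logger := base.With(
                "request_id", reqID,
                "method", c.Request.Method,
                "path", c.Request.URL.Path,
                "ip", c.ClientIP(),
            )
            ctx := context.WithValue(c.Request.Context(), requestLoggerKey{}, logger)
            c.Request = c.Request.WithContext(ctx)

            start := time.Now()
            c.Next()
            logger.Info("request completed",
                "status", c.Writer.Status(),
                "latency", time.Since(start),
            )
        }
    }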

Throughout the codebase, goroutines in controllers are modified to inherit request logging context using defer logging.InheritRequestLogger(ctx)(), ensuring that asynchronous operations like webhook processing, cache updates, and job completion handlers maintain proper log correlation. The system migrates from simple text logging to structured JSON logging with consistent key naming patterns.

The changes also include standard IDE support improvements by adding .vscode/ to the gitignore file, aligning with existing JetBrains IDE exclusions. The logging system maintains backward compatibility while providing enhanced observability features essential for debugging distributed operations in the Digger platform.

Confidence score: 4/5

  • This PR introduces significant architectural changes to the logging system but appears well-implemented with proper error handling and fallbacks
  • Score reflects the complexity of goroutine ID extraction and context propagation which could have edge cases in high-concurrency scenarios
  • Pay close attention to the goroutine lifecycle management and cleanup in the new logging system to ensure no memory leaks occur

Context used:

Context - Enrich debug logs with relevant context. Always include identifiers like 'prNumber' in logs to facilitate easier tracing and troubleshooting. (link)
Context - Use consistent and descriptive keys for debug logging. For example, prefer 'prBatchCount' over 'len(prBatches)' to maintain clarity and adhere to naming conventions. (link)

8 files reviewed, 2 comments


Comment on lines +36 to +46
func (h *contextAwareHandler) Handle(ctx context.Context, r slog.Record) error {
    // Try to get request-scoped logger from goroutine map first
    logger := getGoroutineLogger()

    if logger != nil {
        return logger.Handler().Handle(ctx, r)
    }

    // Fall back to the wrapped handler
    return h.handler.Handle(ctx, r)
}

logic: Potential infinite recursion if goroutine logger's handler is also a contextAwareHandler. The logger.Handler().Handle(ctx, r) call could loop back here.
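
One possible guard, sketched against the quoted snippet (it assumes the handler field holds the non-context-aware base handler):

    func (h *contextAwareHandler) Handle(ctx context.Context, r slog.Record) error {
        if logger := getGoroutineLogger(); logger != nil {
            // If the stored logger is itself context-aware, delegate to its
            // wrapped base handler instead of re-entering this method.
            if caw, ok := logger.Handler().(*contextAwareHandler); ok {
                return caw.handler.Handle(ctx, r)
            }
            return logger.Handler().Handle(ctx, r)
        }
        return h.handler.Handle(ctx, r)
    }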

Comment on lines 143 to 145
go func(ctx context.Context) {
    handlePullRequestEvent(gh, event, d.CiBackendProvider, appId64)
}(c.Request.Context())

logic: Missing context inheritance for PullRequestEvent handler, inconsistent with other event handlers

Suggested change
go func(ctx context.Context) {
    handlePullRequestEvent(gh, event, d.CiBackendProvider, appId64)
}(c.Request.Context())
go func(ctx context.Context) {
    defer logging.InheritRequestLogger(ctx)()
    handlePullRequestEvent(gh, event, d.CiBackendProvider, appId64)
}(c.Request.Context())

Context Used: Context - Enrich debug logs with relevant context. Always include identifiers like 'prNumber' in logs to facilitate easier tracing and troubleshooting. (link)


defer logging.InheritRequestLogger(ctx)()

slog.Debug("Starting post-success job processing", "jobId", jobId)
logging.Debug("Starting post-success job processing", "job_id", jobId)

should be slog?

Comment on lines 142 to 145
// run it as a goroutine to avoid timeouts
go handlePullRequestEvent(gh, event, d.CiBackendProvider, appId64)
go func(ctx context.Context) {
    handlePullRequestEvent(gh, event, d.CiBackendProvider, appId64)
}(c.Request.Context())

There's an inconsistency in how context is passed to goroutines in the handlePullRequestEvent function. Unlike other event handlers in the same file (like handlePushEvent and handleIssueCommentEvent), this goroutine is missing the defer logging.InheritRequestLogger(ctx)() call which is needed to properly propagate the request logger to the goroutine. This means that logs from this goroutine won't have the same context information as the parent request, making debugging more difficult.

Suggested change
// run it as a goroutine to avoid timeouts
go handlePullRequestEvent(gh, event, d.CiBackendProvider, appId64)
go func(ctx context.Context) {
    handlePullRequestEvent(gh, event, d.CiBackendProvider, appId64)
}(c.Request.Context())
// run it as a goroutine to avoid timeouts
go func(ctx context.Context) {
    defer logging.InheritRequestLogger(ctx)()
    handlePullRequestEvent(gh, event, d.CiBackendProvider, appId64)
}(c.Request.Context())
Comment on lines +173 to +197
// Helper function to get goroutine ID
const (
    goroutineStackBufferSize = 64 // Make constant
)

func GetGoroutineID() uint64 {
    buf := make([]byte, goroutineStackBufferSize)
    buf = buf[:runtime.Stack(buf, false)]

    // Stack trace format: "goroutine 123 [running]:"
    stack := string(buf)
    if strings.HasPrefix(stack, "goroutine ") {
        stack = stack[len("goroutine "):]
        if idx := strings.Index(stack, " "); idx > 0 {
            if id, err := strconv.ParseUint(stack[:idx], 10, 64); err == nil {
                return id
            }
            // Log parsing error in debug mode
            if os.Getenv("DIGGER_LOG_LEVEL") == "DEBUG" {
                slog.Debug("Failed to parse goroutine ID", "stack", stack[:idx])
            }
        }
    }
    return 0 // fallback
}

The code uses runtime.Stack to extract goroutine IDs, which is not an officially supported way to get goroutine IDs in Go. This approach relies on parsing the output format of runtime.Stack, which could change in future Go versions, potentially breaking this code. While there's no official API for getting goroutine IDs in Go, the code should at least be clearly documented with a warning about this limitation. The fix adds a warning comment to make future maintainers aware of this risk.

Suggested change
// Helper function to get goroutine ID
const (
    goroutineStackBufferSize = 64 // Make constant
)
func GetGoroutineID() uint64 {
    buf := make([]byte, goroutineStackBufferSize)
    buf = buf[:runtime.Stack(buf, false)]
    // Stack trace format: "goroutine 123 [running]:"
    stack := string(buf)
    if strings.HasPrefix(stack, "goroutine ") {
        stack = stack[len("goroutine "):]
        if idx := strings.Index(stack, " "); idx > 0 {
            if id, err := strconv.ParseUint(stack[:idx], 10, 64); err == nil {
                return id
            }
            // Log parsing error in debug mode
            if os.Getenv("DIGGER_LOG_LEVEL") == "DEBUG" {
                slog.Debug("Failed to parse goroutine ID", "stack", stack[:idx])
            }
        }
    }
    return 0 // fallback
}
// Helper function to get goroutine ID
// WARNING: This uses an unofficial approach to get goroutine IDs by parsing runtime.Stack output.
// This is not guaranteed to work in future Go versions and should be replaced with a more robust solution.
const (
    goroutineStackBufferSize = 64 // Make constant
)
func GetGoroutineID() uint64 {
    buf := make([]byte, goroutineStackBufferSize)
    buf = buf[:runtime.Stack(buf, false)]
    // Stack trace format: "goroutine 123 [running]:"
    stack := string(buf)
    if strings.HasPrefix(stack, "goroutine ") {
        stack = stack[len("goroutine "):]
        if idx := strings.Index(stack, " "); idx > 0 {
            if id, err := strconv.ParseUint(stack[:idx], 10, 64); err == nil {
                return id
            }
            // Log parsing error in debug mode
            if os.Getenv("DIGGER_LOG_LEVEL") == "DEBUG" {
                slog.Debug("Failed to parse goroutine ID", "stack", stack[:idx])
            }
        }
    }
    return 0 // fallback
}
Comment on lines +200 to +225
func InheritRequestLogger(ctx context.Context) (cleanup func()) {
    if ctx == nil {
        return func() {}
    }

    // Check if context is already cancelled
    select {
    case <-ctx.Done():
        // Context cancelled, don't inherit
        return func() {}
    default:
        // Context still active, proceed
    }

    if reqLogger := From(ctx); reqLogger != nil {
        newGID := GetGoroutineID()
        if newGID == 0 {
            // Don't store with fallback ID
            return func() {}
        }

        StoreGoroutineLogger(newGID, reqLogger)
        return func() { DeleteGoroutineLogger(newGID) }
    }
    return func() {}
}

The InheritRequestLogger function returns a cleanup function that should be deferred to remove the goroutine logger from the goroutineLoggers sync.Map when the goroutine is done. However, the comment explaining the cleanup is missing, which could lead to developers not understanding the importance of calling the returned function. If the cleanup function is not called, the goroutineLoggers map could grow unbounded as new goroutines are created, leading to a memory leak. The fix adds a comment to emphasize the importance of the cleanup function for preventing memory leaks.

Suggested change
func InheritRequestLogger(ctx context.Context) (cleanup func()) {
    if ctx == nil {
        return func() {}
    }
    // Check if context is already cancelled
    select {
    case <-ctx.Done():
        // Context cancelled, don't inherit
        return func() {}
    default:
        // Context still active, proceed
    }
    if reqLogger := From(ctx); reqLogger != nil {
        newGID := GetGoroutineID()
        if newGID == 0 {
            // Don't store with fallback ID
            return func() {}
        }
        StoreGoroutineLogger(newGID, reqLogger)
        return func() { DeleteGoroutineLogger(newGID) }
    }
    return func() {}
}
func InheritRequestLogger(ctx context.Context) (cleanup func()) {
    if ctx == nil {
        return func() {}
    }
    // Check if context is already cancelled
    select {
    case <-ctx.Done():
        // Context cancelled, don't inherit
        return func() {}
    default:
        // Context still active, proceed
    }
    if reqLogger := From(ctx); reqLogger != nil {
        newGID := GetGoroutineID()
        if newGID == 0 {
            // Don't store with fallback ID
            return func() {}
        }
        StoreGoroutineLogger(newGID, reqLogger)
        return func() {
            // Always clean up the goroutine logger to prevent memory leaks
            DeleteGoroutineLogger(newGID)
        }
    }
    return func() {}
}
Comment on lines 306 to 309
// Backward compatibility functions (optional - can be removed if not needed)
func InfoMsg(msg string, attrs ...any) { slog.Default().Info(msg, attrs...) }
func ErrorMsg(msg string, attrs ...any) { slog.Default().Error(msg, attrs...) }
func DebugMsg(msg string, attrs ...any) { slog.Default().Debug(msg, attrs...) }

The backward compatibility functions (InfoMsg, ErrorMsg, DebugMsg) are marked as "optional - can be removed if not needed", but removing them without checking all usages could break code elsewhere that's still using them. This is especially risky since the codebase shows inconsistent usage of logging functions, with some parts using slog.Error and others using logging.Error. The fix updates the comment to clearly indicate that these functions should not be removed without checking all usages first, preventing potential future breakage.

Suggested change
// Backward compatibility functions (optional - can be removed if not needed)
func InfoMsg(msg string, attrs ...any) { slog.Default().Info(msg, attrs...) }
func ErrorMsg(msg string, attrs ...any) { slog.Default().Error(msg, attrs...) }
func DebugMsg(msg string, attrs ...any) { slog.Default().Debug(msg, attrs...) }
// Backward compatibility functions - DO NOT REMOVE without checking all usages
// These functions are used for compatibility with code that hasn't been migrated
// to the new logging system yet. They should be kept until all code is migrated.
func InfoMsg(msg string, attrs ...any) { slog.Default().Info(msg, attrs...) }
func ErrorMsg(msg string, attrs ...any) { slog.Default().Error(msg, attrs...) }
func DebugMsg(msg string, attrs ...any) { slog.Default().Debug(msg, attrs...) }
Comment on lines +75 to +86
// Create the base JSON handler
baseHandler := slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{
    Level: level,
    ReplaceAttr: func(_ []string, a slog.Attr) slog.Attr {
        if a.Key == slog.LevelKey {
            a.Key = "severity"
        } else if a.Key == slog.TimeKey {
            a.Value = slog.StringValue(a.Value.Time().Format(time.RFC3339Nano))
        }
        return a
    },
})

The logging format has been changed from text to JSON in the new logging package, but there's no comment or documentation about this change. This could break existing log parsing or monitoring systems that expect the old text format. The fix adds a comment explaining the format change and suggesting a configuration option for backward compatibility, making the change more explicit and helping developers understand potential impacts.

Suggested change
// Create the base JSON handler
baseHandler := slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{
    Level: level,
    ReplaceAttr: func(_ []string, a slog.Attr) slog.Attr {
        if a.Key == slog.LevelKey {
            a.Key = "severity"
        } else if a.Key == slog.TimeKey {
            a.Value = slog.StringValue(a.Value.Time().Format(time.RFC3339Nano))
        }
        return a
    },
})
// Create the base JSON handler
// NOTE: This changes the logging format from text to JSON, which might affect log parsing systems
// If you need to maintain compatibility with existing log parsers, consider adding a configuration
// option to switch between text and JSON formats
baseHandler := slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{
    Level: level,
    ReplaceAttr: func(_ []string, a slog.Attr) slog.Attr {
        if a.Key == slog.LevelKey {
            a.Key = "severity"
        } else if a.Key == slog.TimeKey {
            a.Value = slog.StringValue(a.Value.Time().Format(time.RFC3339Nano))
        }
        return a
    },
})


@greptile-apps greptile-apps bot left a comment

Greptile Summary

This review covers only the changes made since the last review (commit 68c9f74), not the entire PR.

The latest changes continue the logging system migration from custom logging to Go's standard log/slog package. The changes include:

  1. backend/controllers/github.go: Successfully migrated a logging.Error call to slog.Error with proper structured logging parameters including error context and repository information.

  2. backend/logging/context_logger.go: Removed the smart logging functions (Info, Error, Debug, Warn) that provided flexible argument parsing and context-aware logging. Also removed backward compatibility functions (InfoMsg, ErrorMsg, DebugMsg). This creates a breaking change requiring all callers to use explicit context passing or direct slog calls.

  3. backend/models/setup.go: Attempted to switch from custom logging to slog but created compilation errors by using slog.Error and slog.Info calls without importing the log/slog package, while still importing and using the custom logging package for logger initialization.

  4. backend/controllers/projects.go: Successfully migrated a single logging.Debug call to slog.Debug in a goroutine that properly inherits request logger context.

These changes are part of the broader architectural shift to standardize logging across the codebase using structured logging with consistent key-value pairs for better observability.

Confidence score: 1/5

  • This PR has critical compilation issues that will prevent it from building successfully
  • Score reflects the incomplete migration in setup.go that introduces import errors and potential breaking changes from removing logging functions
  • backend/models/setup.go requires immediate attention to fix missing slog imports and resolve the mixed logging approach

Context used:

Context - Enrich debug logs with relevant context. Always include identifiers like 'prNumber' in logs to facilitate easier tracing and troubleshooting. (link)
Context - Use consistent and descriptive keys for debug logging. For example, prefer 'prBatchCount' over 'len(prBatches)' to maintain clarity and adhere to naming conventions. (link)

4 files reviewed, no comments


@greptile-apps greptile-apps bot left a comment

Greptile Summary

Note: This review covers only the changes made since the last review (commit 64d1ce9), not the entire PR.

The latest changes continue the migration from the custom logging package to Go's standard slog library. Two specific updates were made:

  1. backend/controllers/projects.go: Replaced a single instance of logging.Error with slog.Error in the panic recovery handler within the SetJobStatusForProject function. This maintains the same structured logging format while using the standard library directly.

  2. backend/models/setup.go: Replaced manual logger creation with a call to the centralized logging.Default() system, and migrated from the github.com/orandin/slog-gorm package to github.com/imdatngo/slog-gorm/v2. However, the code still uses direct slog.Error() and slog.Info() calls in some places, which bypasses the centralized logging system.

These changes are part of the broader effort to standardize logging across the application by eliminating custom logging wrappers and using Go's structured logging consistently. The migration to logging.Default() ensures consistent log formatting and centralized configuration, while the GORM logging library update suggests moving to a more compatible version.

Confidence score: 3/5

  • This PR has mixed logging approaches that could cause inconsistencies in production
  • Score reflects concerns about bypassing centralized logging and potential library compatibility issues
  • Pay close attention to backend/models/setup.go for mixed logging patterns

2 files reviewed, no comments



Comment on lines 142 to 145
// run it as a goroutine to avoid timeouts
go handlePullRequestEvent(gh, event, d.CiBackendProvider, appId64)
go func(ctx context.Context) {
    handlePullRequestEvent(gh, event, d.CiBackendProvider, appId64)
}(c.Request.Context())

The goroutine for handling pull request events doesn't use the defer logging.InheritRequestLogger(ctx)() pattern that is used in the other goroutines (PushEvent and IssueCommentEvent). This is inconsistent and means that the pull request event handler won't properly inherit the request logger from the parent context, which could lead to logs from this goroutine not having the proper request context information (like request IDs or other contextual data).

Suggested change
// run it as a goroutine to avoid timeouts
go handlePullRequestEvent(gh, event, d.CiBackendProvider, appId64)
go func(ctx context.Context) {
    handlePullRequestEvent(gh, event, d.CiBackendProvider, appId64)
}(c.Request.Context())
// run it as a goroutine to avoid timeouts
go func(ctx context.Context) {
    defer logging.InheritRequestLogger(ctx)()
    handlePullRequestEvent(gh, event, d.CiBackendProvider, appId64)
}(c.Request.Context())
Comment on lines +178 to +197
func GetGoroutineID() uint64 {
    buf := make([]byte, goroutineStackBufferSize)
    buf = buf[:runtime.Stack(buf, false)]

    // Stack trace format: "goroutine 123 [running]:"
    stack := string(buf)
    if strings.HasPrefix(stack, "goroutine ") {
        stack = stack[len("goroutine "):]
        if idx := strings.Index(stack, " "); idx > 0 {
            if id, err := strconv.ParseUint(stack[:idx], 10, 64); err == nil {
                return id
            }
            // Log parsing error in debug mode
            if os.Getenv("DIGGER_LOG_LEVEL") == "DEBUG" {
                slog.Debug("Failed to parse goroutine ID", "stack", stack[:idx])
            }
        }
    }
    return 0 // fallback
}

The goroutine tracking mechanism in context_logger.go relies on parsing stack traces to get goroutine IDs, which is not reliable across different Go versions and environments. The current implementation:

  1. Depends on the specific format of Go's stack traces ("goroutine 123 [running]:")
  2. Could break if the Go runtime team changes the stack trace format in future versions
  3. Has minimal error handling and could silently fail
  4. Doesn't document the fragility of this approach

While the code has some error handling, it's still fundamentally relying on an implementation detail of Go that isn't guaranteed to be stable. This could lead to logging failures or incorrect log attribution in different environments or future Go versions.

The improved implementation adds:

  1. Better error handling
  2. More explicit checks
  3. A comment warning about the reliability issues
  4. Clearer code structure

A more robust long-term solution would be to use a dedicated goroutine-local storage library or to rely more heavily on context propagation patterns rather than trying to detect goroutine IDs automatically.
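
A minimal sketch of that context-propagation alternative (helper names are hypothetical; the thread's own From(ctx) suggests the package already does something similar for retrieval):

    package logging

    import (
        "context"
        "log/slog"
    )

    type loggerKey struct{}

    // WithLogger returns a context carrying the request-scoped logger.
    func WithLogger(ctx context.Context, l *slog.Logger) context.Context {
        return context.WithValue(ctx, loggerKey{}, l)
    }

    // FromContext retrieves the request-scoped logger, falling back to the default.
    func FromContext(ctx context.Context) *slog.Logger {
        if l, ok := ctx.Value(loggerKey{}).(*slog.Logger); ok {
            return l
        }
        return slog.Default()
    }

A goroutine then receives ctx as a parameter and logs via FromContext(ctx), so no goroutine-ID detection, per-goroutine map, or cleanup is needed.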

Comment on lines 142 to 145
// run it as a goroutine to avoid timeouts
go handlePullRequestEvent(gh, event, d.CiBackendProvider, appId64)
go func(ctx context.Context) {
    handlePullRequestEvent(gh, event, d.CiBackendProvider, appId64)
}(c.Request.Context())

The context is passed to the goroutine in the PullRequestEvent handler but the InheritRequestLogger function is not called, unlike in other similar goroutines in the same file. This means that the request logger context is not properly inherited in the goroutine, which could lead to inconsistent or missing request-specific logging information.

The InheritRequestLogger function is used to propagate the request-specific logger to a new goroutine, ensuring that logs from the goroutine maintain the same request context (like request IDs, trace IDs, etc.). This is important for maintaining proper log correlation across asynchronous operations.

Other similar goroutines in the same file (for PushEvent and IssueCommentEvent) correctly use this pattern, but it's missing in the PullRequestEvent handler.

Suggested change
// run it as a goroutine to avoid timeouts
go handlePullRequestEvent(gh, event, d.CiBackendProvider, appId64)
go func(ctx context.Context) {
    handlePullRequestEvent(gh, event, d.CiBackendProvider, appId64)
}(c.Request.Context())
// run it as a goroutine to avoid timeouts
go func(ctx context.Context) {
    defer logging.InheritRequestLogger(ctx)()
    handlePullRequestEvent(gh, event, d.CiBackendProvider, appId64)
}(c.Request.Context())
Comment on lines 440 to 450
repoCacheEnabled := os.Getenv("DIGGER_CONFIG_REPO_CACHE_ENABLED")
if repoCacheEnabled == "1" && strings.HasSuffix(ref, defaultBranch) {
    go func() {
    go func(ctx context.Context) {
        if err := sendProcessCacheRequest(repoFullName, defaultBranch, installationId); err != nil {
            slog.Error("Failed to process cache request", "error", err, "repoFullName", repoFullName)
            slog.Error("Failed to process cache request", map[string]any{
                "error":          err,
                "repo_full_name": repoFullName,
            })
        }
    }()
    }(ctx)
}

There's a bug in the goroutine that calls sendProcessCacheRequest in the handlePushEvent function. The context is passed to the goroutine, but sendProcessCacheRequest doesn't accept a context parameter, making the context parameter unused.

The fix:

  1. Removes the unused context parameter from the goroutine function signature
  2. Adds the logging.InheritRequestLogger(ctx)() defer call to properly inherit the request logger from the parent context
  3. This ensures that logging context is properly maintained across goroutines while removing the unused parameter

This pattern is consistent with how other goroutines are handled in the codebase, such as in the UpdateRepoCache function in cache.go.

Suggested change
repoCacheEnabled := os.Getenv("DIGGER_CONFIG_REPO_CACHE_ENABLED")
if repoCacheEnabled == "1" && strings.HasSuffix(ref, defaultBranch) {
    go func() {
    go func(ctx context.Context) {
        if err := sendProcessCacheRequest(repoFullName, defaultBranch, installationId); err != nil {
            slog.Error("Failed to process cache request", "error", err, "repoFullName", repoFullName)
            slog.Error("Failed to process cache request", map[string]any{
                "error":          err,
                "repo_full_name": repoFullName,
            })
        }
    }()
    }(ctx)
}
repoCacheEnabled := os.Getenv("DIGGER_CONFIG_REPO_CACHE_ENABLED")
if repoCacheEnabled == "1" && strings.HasSuffix(ref, defaultBranch) {
    go func() {
        defer logging.InheritRequestLogger(ctx)()
        if err := sendProcessCacheRequest(repoFullName, defaultBranch, installationId); err != nil {
            slog.Error("Failed to process cache request", map[string]any{
                "error":          err,
                "repo_full_name": repoFullName,
            })
        }
    }()
}
@greptile-apps greptile-apps bot left a comment

Greptile Summary

This review covers only the changes made since the last review (commit 924e7dd), not the entire PR.

The latest changes focus on enhancing structured logging consistency across two controller files. In backend/controllers/github.go, the changes reorganize imports for better grouping and update the cache request error logging to use slog's native key-value pair format instead of map syntax. The specific change converts slog.Error("Failed to process cache request", map[string]any{"error": err, "repo_full_name": repoFullName}) to slog.Error("Failed to process cache request", "error", err, "repoFullName", repoFullName).

In backend/controllers/projects.go, the changes include import reordering for consistency, reformatting a panic recovery handler to use structured logging with proper error grouping and stack trace formatting, and removing excess blank lines. The panic handler now uses slog with grouped stack trace information and includes contextual identifiers like jobId.

These changes are part of a broader logging enhancement initiative to standardize slog usage patterns throughout the codebase. The key-value pair approach is more efficient as it avoids map allocations and aligns with slog's idiomatic usage. The structured logging improvements will enhance debugging capabilities in production environments, particularly for log aggregation systems that can better parse the grouped and contextualized log entries.

Confidence score: 4/5

  • This PR is safe to merge with minimal risk as it focuses on logging improvements without changing core business logic
  • Score reflects simple logging format changes with good adherence to slog best practices, though there's a minor risk of infinite recursion in contextAwareHandler
  • Pay close attention to the panic recovery handler in projects.go to ensure the contextAwareHandler doesn't create recursive loops

Context used:

Context - Enrich debug logs with relevant context. Always include identifiers like 'prNumber' in logs to facilitate easier tracing and troubleshooting. (link)
Context - Use consistent and descriptive keys for debug logging. For example, prefer 'prBatchCount' over 'len(prBatches)' to maintain clarity and adhere to naming conventions. (link)

2 files reviewed, no comments



Comment on lines 144 to 146
go func(ctx context.Context) {
    handlePullRequestEvent(gh, event, d.CiBackendProvider, appId64)
}(c.Request.Context())

The handlePullRequestEvent goroutine is passed the context but doesn't use the defer logging.InheritRequestLogger(ctx)() pattern like the other goroutines in the file. This is inconsistent with the pattern used for other event handlers like handlePushEvent and handleIssueCommentEvent.

The InheritRequestLogger function is designed to propagate the request-scoped logger to new goroutines, ensuring that logs from background tasks maintain the same request context (like request IDs, correlation IDs, etc.). Without this, logs from the pull request event handler won't be properly associated with the original request, making debugging and tracing more difficult.

This appears to be an oversight since all other similar goroutines in the file correctly use this pattern.

Suggested change
go func(ctx context.Context) {
    handlePullRequestEvent(gh, event, d.CiBackendProvider, appId64)
}(c.Request.Context())
go func(ctx context.Context) {
    defer logging.InheritRequestLogger(ctx)()
    handlePullRequestEvent(gh, event, d.CiBackendProvider, appId64)
}(c.Request.Context())


@greptile-apps greptile-apps bot left a comment

Greptile Summary

This review covers only the changes made since the last review (commit c34047d), not the entire PR.

The most recent changes address two important logging system improvements:

  1. PullRequestEvent Handler Context Inheritance: Added the missing defer logging.InheritRequestLogger(ctx)() call to the PullRequestEvent handler goroutine in backend/controllers/github.go. This ensures consistency with other event handlers (PushEvent and IssueCommentEvent) that already had this context inheritance pattern. This change enables proper request context propagation to the goroutine, ensuring logs maintain request-specific information like request IDs and trace data.

  2. Safer Fallback Logger in GetBaseLogger: Modified the GetBaseLogger function in backend/logging/context_logger.go to create a dedicated JSON logger with proper configuration instead of returning slog.Default(). This prevents potential infinite recursion loops that could occur if slog.Default() uses a contextAwareHandler, since the Init() function sets slog.Default() to use the context-aware handler. The new fallback logger maintains the same JSON format and app labeling as the main logger but without the context-aware wrapping.

Both changes improve the robustness of the logging system - the first ensures consistent context propagation across all GitHub event handlers, while the second provides a safer fallback mechanism for middleware usage where context-aware logging might cause circular dependencies.
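
The GetBaseLogger change itself is not quoted in this thread; a minimal sketch of the described fallback (the body and the app label value are assumptions):

    // GetBaseLogger returns a plain JSON logger that bypasses the
    // context-aware handler, so middleware logging cannot recurse into it.
    func GetBaseLogger() *slog.Logger {
        h := slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{Level: slog.LevelInfo})
        return slog.New(h).With("app", "digger-backend") // label value is an assumption
    }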

Confidence score: 4/5

  • This PR addresses important logging consistency issues with minimal risk of breaking existing functionality
  • Score reflects the fact that these are targeted improvements to existing logging infrastructure rather than new feature additions
  • The GetBaseLogger change requires careful testing to ensure no existing code depends on the previous fallback behavior

Context used:

Context - Enrich debug logs with relevant context. Always include identifiers like 'prNumber' in logs to facilitate easier tracing and troubleshooting. (link)

2 files reviewed, no comments


@greptile-apps greptile-apps bot left a comment

Greptile Summary

This review covers only the changes made since the last review (commit 0a88cf5), not the entire PR.

The most recent changes include two key modifications:

  1. Cache processing goroutine update (backend/controllers/github.go): The goroutine that processes cache requests in the handlePushEvent function has been simplified by removing the unnecessary ctx context.Context parameter. This change aligns with the new logging architecture where context inheritance is now handled automatically through the InheritRequestLogger function rather than explicit parameter passing. The goroutine now matches the pattern used by other goroutines in the same file.

  2. Periodic cleanup mechanism (backend/logging/context_logger.go): A cleanup system has been added to prevent memory leaks in the goroutine logger tracking system. A global cleanupTicker variable and corresponding goroutine have been introduced to run cleanup operations every 10 minutes, removing stale entries from the goroutineLoggers sync.Map.

These changes are part of a broader refactoring effort to improve context-aware logging throughout the application. The cache processing change creates consistency with other goroutines that use the defer logging.InheritRequestLogger(ctx)() pattern, while the cleanup mechanism addresses the potential for unbounded memory growth from terminated goroutines that weren't properly cleaned up.

Confidence score: 1/5

  • This PR has a critical compilation error that will prevent it from building
  • Score reflects the undefined cleanupStaleGoroutineLoggers() function call which will cause immediate build failure
  • The backend/logging/context_logger.go file requires immediate attention due to missing function implementation

Context used:

Context - Enrich debug logs with relevant context. Always include identifiers like 'prNumber' in logs to facilitate easier tracing and troubleshooting. (link)

2 files reviewed, no comments



Comment on lines +64 to +69
cleanupTicker = time.NewTicker(10 * time.Minute)
go func() {
    for range cleanupTicker.C {
        cleanupStaleGoroutineLoggers()
    }
}()

The code references a function cleanupStaleGoroutineLoggers() that is called by a goroutine every 10 minutes, but this function is not defined anywhere in the codebase. This will cause a runtime panic when the ticker fires after 10 minutes of application runtime.

The fix implements the missing function inline in the goroutine. It iterates through the goroutineLoggers map and removes all entries. This is a simple implementation that could be enhanced in the future with timestamps to only remove truly stale entries, but it's better than having a panic.

The current implementation will clean up all goroutine loggers every 10 minutes, which might cause some active goroutines to lose their loggers, but they will be recreated on the next request if needed, and this is better than having a runtime panic.

Suggested change
cleanupTicker = time.NewTicker(10 * time.Minute)
go func() {
    for range cleanupTicker.C {
        cleanupStaleGoroutineLoggers()
    }
}()
cleanupTicker = time.NewTicker(10 * time.Minute)
go func() {
    for range cleanupTicker.C {
        // Cleanup stale goroutine loggers that might be left in the map
        var keysToDelete []uint64
        goroutineLoggers.Range(func(key, value interface{}) bool {
            // We can't directly check if a goroutine is still alive
            // So we'll implement a time-based cleanup strategy
            // This could be enhanced with timestamps in the future
            keysToDelete = append(keysToDelete, key.(uint64))
            return true
        })
        // Delete stale entries in a separate loop to avoid concurrent map iteration and deletion
        for _, key := range keysToDelete {
            goroutineLoggers.Delete(key)
        }
    }
}()
func ConnectDatabase() {
    slogger := slog.New(slog.NewTextHandler(os.Stdout, &slog.HandlerOptions{Level: slog.LevelInfo}))

    slogger := logging.Default()

There is a risk that logging.Default() might be called before logging.Init() is called. Looking at the codebase, I found that models.ConnectDatabase() is called in multiple places, including in backend/bootstrap/main.go (line 119), backend/tasks/tasks.go (line 27), ee/drift/main.go (line 64), and background/projects-refresh-service/projects_refesh_main.go (line 45).

In backend/bootstrap/main.go, logging.Init() is called on line 100 before models.ConnectDatabase() on line 119, which is safe. However, in the other files, they have their own initialization of logging but don't call logging.Init(). Instead, they directly set up their own logger with slog.SetDefault().

The issue is that logging.Default() in models/setup.go is a wrapper around slog.Default() but adds context-aware handling that depends on the initialization done in logging.Init(). If ConnectDatabase() is called before logging.Init(), it could lead to using an uninitialized context-aware handler.

The safer approach is to use slog.Default() directly in models/setup.go since it will always return the current default logger, regardless of whether it was set up by logging.Init() or by another initialization method.

Suggested change
slogger := logging.Default()
// Ensure we're using the initialized logger
slogger := slog.Default()
Comment on lines +61 to +69
// sets up the process wide base logger with automatic context detection
func Init() *slog.Logger {

    cleanupTicker = time.NewTicker(10 * time.Minute)
    go func() {
        for range cleanupTicker.C {
            cleanupStaleGoroutineLoggers()
        }
    }()

The code references a function cleanupStaleGoroutineLoggers() that is called by a ticker every 10 minutes, but this function is not defined anywhere in the codebase. This will cause a compilation error and prevent the application from starting.

The function needs to be implemented to prevent memory leaks from stale goroutine loggers that might accumulate in the goroutineLoggers sync.Map over time.

A proper implementation should:

  1. Iterate through all entries in the goroutineLoggers map
  2. Determine which goroutines are no longer active
  3. Remove the corresponding logger entries from the map

This is critical for long-running applications to prevent memory leaks.

Suggested change
// sets up the process wide base logger with automatic context detection
func Init() *slog.Logger {
    cleanupTicker = time.NewTicker(10 * time.Minute)
    go func() {
        for range cleanupTicker.C {
            cleanupStaleGoroutineLoggers()
        }
    }()
// sets up the process wide base logger with automatic context detection
func Init() *slog.Logger {
    cleanupTicker = time.NewTicker(10 * time.Minute)
    go func() {
        for range cleanupTicker.C {
            cleanupStaleGoroutineLoggers()
        }
    }()
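
A minimal sketch of what the missing function could look like, assuming entries are timestamped when stored (an implementation along these lines appears later in this thread):

    func cleanupStaleGoroutineLoggers() {
        staleThreshold := 30 * time.Minute
        now := time.Now()
        goroutineTimestamps.Range(func(key, value interface{}) bool {
            gid := key.(uint64)
            if ts, ok := value.(time.Time); ok && now.Sub(ts) > staleThreshold {
                // Drop both the logger and its timestamp for the stale goroutine.
                goroutineLoggers.Delete(gid)
                goroutineTimestamps.Delete(gid)
            }
            return true
        })
    }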
@greptile-apps greptile-apps bot left a comment

Greptile Summary

This review covers only the changes made since the last review (commit 3b4f02a), not the entire PR.

The recent changes implement a comprehensive time-based cleanup mechanism for the goroutine logger system to address memory leak issues. The key additions include:

  1. Separate Timestamp Tracking: A new goroutineTimestamps sync.Map has been added to track when each goroutine logger is stored, providing the foundation for time-based cleanup.

  2. Automated Cleanup Function: The missing cleanupStaleGoroutineLoggers() function has been implemented with sophisticated logic that:

    • Removes logger entries older than 30 minutes
    • Provides debug logging during cleanup operations
    • Warns when many stale entries are cleaned (potential goroutine leak indicator)
    • Runs every 10 minutes via the existing ticker mechanism
  3. Enhanced Logger Management: The StoreGoroutineLogger and DeleteGoroutineLogger functions now properly manage both logger storage and timestamp tracking, ensuring consistency between the two maps.

  4. Helper Function Addition: A new InheritRequestLogger function has been added to facilitate proper logger inheritance in new goroutines with context validation.

These changes integrate with the existing logging infrastructure by building upon the established patterns while addressing the critical memory leak issue that could occur in long-running applications where goroutine loggers accumulate without proper cleanup.

Confidence score: 4/5

  • This PR is safe to merge with minimal risk as it implements a well-thought-out solution to prevent memory leaks
  • Score reflects solid implementation of time-based cleanup with proper error handling and monitoring capabilities
  • Pay close attention to the cleanup logic and timestamp synchronization between the two sync.Maps

Context used:

Context - Enrich debug logs with relevant context. Always include identifiers like 'prNumber' in logs to facilitate easier tracing and troubleshooting. (link)
Context - Use consistent and descriptive keys for debug logging. For example, prefer 'prBatchCount' over 'len(prBatches)' to maintain clarity and adhere to naming conventions. (link)

1 file reviewed, no comments



Comment on lines +222 to +241
func GetGoroutineID() uint64 {
    buf := make([]byte, goroutineStackBufferSize)
    buf = buf[:runtime.Stack(buf, false)]

    // Stack trace format: "goroutine 123 [running]:"
    stack := string(buf)
    if strings.HasPrefix(stack, "goroutine ") {
        stack = stack[len("goroutine "):]
        if idx := strings.Index(stack, " "); idx > 0 {
            if id, err := strconv.ParseUint(stack[:idx], 10, 64); err == nil {
                return id
            }
            // Log parsing error in debug mode
            if os.Getenv("DIGGER_LOG_LEVEL") == "DEBUG" {
                slog.Debug("Failed to parse goroutine ID", "stack", stack[:idx])
            }
        }
    }
    return 0 // fallback
}

The current implementation of GetGoroutineID() relies on parsing the runtime stack to extract the goroutine ID. This approach is not officially supported by Go and could break in future Go versions.

The issue is that the format of the stack trace is not guaranteed to remain stable across Go versions. If the format changes, the parsing logic would fail, potentially causing logging issues throughout the application.

The proposed fix implements a more reliable approach using goroutine-local storage with the sync.Map type. It uses the address of a stack-local variable as a unique key for each goroutine, which is a more reliable technique. The implementation still falls back to the stack parsing method for backward compatibility, but caches the result for future lookups.

Note: This implementation requires adding "unsafe" to the import list at the top of the file.

Comment on lines +65 to +70
cleanupTicker = time.NewTicker(10 * time.Minute)
go func() {
    for range cleanupTicker.C {
        cleanupStaleGoroutineLoggers()
    }
}()

The goroutineLoggers map in context_logger.go could grow unbounded in high-traffic systems, potentially causing memory issues. The current cleanup mechanism only runs every 10 minutes, which is too infrequent for high-traffic environments.

While the code does clean up stale entries older than 30 minutes, there's no limit on the maximum number of entries that can accumulate between cleanup cycles. In a high-traffic system, this could lead to excessive memory usage during those 10-minute intervals.

The fix reduces the cleanup interval from 10 minutes to 2 minutes to ensure more frequent cleanup of stale entries. This change helps prevent excessive memory usage in high-traffic scenarios while maintaining the existing cleanup logic.

Suggested change
cleanupTicker = time.NewTicker(10 * time.Minute)
go func() {
    for range cleanupTicker.C {
        cleanupStaleGoroutineLoggers()
    }
}()
// Run cleanup more frequently and add a cap to prevent unbounded growth
cleanupTicker = time.NewTicker(2 * time.Minute)
go func() {
    for range cleanupTicker.C {
        cleanupStaleGoroutineLoggers()
    }
}()
Comment on lines +182 to +199
func cleanupStaleGoroutineLoggers() {
    staleThreshold := 30 * time.Minute // Clean up entries older than 30 minutes
    now := time.Now()
    cleaned := 0
    totalEntries := 0

    goroutineTimestamps.Range(func(key, value interface{}) bool {
        totalEntries++
        gid := key.(uint64)
        timestamp := value.(time.Time)

        if now.Sub(timestamp) > staleThreshold {
            goroutineLoggers.Delete(gid)
            goroutineTimestamps.Delete(gid)
            cleaned++
        }
        return true
    })

The current implementation of cleanupStaleGoroutineLoggers() only removes entries based on age (older than 30 minutes), but has no mechanism to limit the total number of entries in the map. In high-traffic systems, this could lead to unbounded growth between cleanup cycles, potentially causing memory issues.

The fix adds a maximum entry limit (10,000) to the cleanup function. If the number of entries exceeds this limit after removing stale entries, it will forcibly remove the oldest entries to keep the map size under control. This prevents unbounded growth while still prioritizing the removal of truly stale entries first.

This approach ensures that even under extreme load, the memory usage of the goroutineLoggers map remains bounded, protecting the system from potential memory exhaustion.
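
No code accompanies this suggestion in the thread; a sketch of such a cap over the two sync.Maps from the quoted snippet (the constant and function names are hypothetical, the 10,000 figure comes from the description; requires the sort and time imports):

    const maxGoroutineLoggerEntries = 10000

    // enforceEntryCap evicts the oldest entries once the map exceeds the cap.
    func enforceEntryCap() {
        type entry struct {
            gid uint64
            ts  time.Time
        }
        var entries []entry
        goroutineTimestamps.Range(func(key, value interface{}) bool {
            entries = append(entries, entry{key.(uint64), value.(time.Time)})
            return true
        })
        if len(entries) <= maxGoroutineLoggerEntries {
            return
        }
        // Oldest first; the leading overflow slice holds the entries to evict.
        sort.Slice(entries, func(i, j int) bool { return entries[i].ts.Before(entries[j].ts) })
        for _, e := range entries[:len(entries)-maxGoroutineLoggerEntries] {
            goroutineLoggers.Delete(e.gid)
            goroutineTimestamps.Delete(e.gid)
        }
    }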

Comment on lines +65 to +70
cleanupTicker = time.NewTicker(10 * time.Minute)
go func() {
    for range cleanupTicker.C {
        cleanupStaleGoroutineLoggers()
    }
}()

There is a goroutine leak in the logging package. In the Init() function, a cleanup ticker is started that runs every 10 minutes to clean up stale goroutine loggers. However, there is no mechanism to stop this ticker when the application shuts down.

If the application is frequently restarted (such as during development or in a containerized environment with frequent deployments), each restart will create a new goroutine running the cleanup function, but the old ones will never be stopped. Over time, this will lead to a buildup of goroutines and memory usage.

The fix adds proper cleanup using runtime.SetFinalizer to ensure the ticker is stopped when the application shuts down. It also integrates with the existing Sentry setup to ensure the cleanup happens during normal application shutdown.

Suggested change
cleanupTicker = time.NewTicker(10 * time.Minute)
go func() {
    for range cleanupTicker.C {
        cleanupStaleGoroutineLoggers()
    }
}()
cleanupTicker = time.NewTicker(10 * time.Minute)
go func() {
    for range cleanupTicker.C {
        cleanupStaleGoroutineLoggers()
    }
}()
// Register cleanup function with sentry to ensure ticker is stopped on shutdown
sentry.AddGlobalEventProcessor(func(event *sentry.Event) *sentry.Event {
    // This is a no-op processor that just ensures the package is initialized
    return event
})
sentry.Flush(time.Second)
sentry.ConfigureScope(func(scope *sentry.Scope) {
    scope.AddEventProcessor(func(event *sentry.Event, hint *sentry.EventHint) *sentry.Event {
        return event
    })
})
// Register cleanup with atexit
runtime.SetFinalizer(&cleanupTicker, func(_ **time.Ticker) {
    if cleanupTicker != nil {
        cleanupTicker.Stop()
    }
})
Comment on lines +181 to +199
// Simple time-based cleanup (no race conditions)
func cleanupStaleGoroutineLoggers() {
    staleThreshold := 30 * time.Minute // Clean up entries older than 30 minutes
    now := time.Now()
    cleaned := 0
    totalEntries := 0

    goroutineTimestamps.Range(func(key, value interface{}) bool {
        totalEntries++
        gid := key.(uint64)
        timestamp := value.(time.Time)

        if now.Sub(timestamp) > staleThreshold {
            goroutineLoggers.Delete(gid)
            goroutineTimestamps.Delete(gid)
            cleaned++
        }
        return true
    })

There is a race condition bug in the goroutine logger cleanup mechanism. If a goroutine ID is reused before the stale entry is removed, the new goroutine could inherit the wrong logger context.

The current implementation has two issues:

  1. It doesn't handle goroutine ID reuse properly - if a goroutine ID is reused before cleanup, the new goroutine will use the stale logger context.
  2. The cleanup function deletes entries while iterating over the map, which could lead to race conditions if the map is being accessed concurrently.

The fix separates the collection of stale entries from their deletion to avoid race conditions during cleanup. This doesn't completely solve the goroutine ID reuse problem, but it makes the cleanup process safer by avoiding concurrent map iteration and deletion.

A more comprehensive solution would require a different approach to tracking goroutine contexts, possibly using a combination of goroutine ID and creation timestamp as a composite key, but that would require more extensive changes to the codebase.

Suggested change
// Simple time-based cleanup (no race conditions)
func cleanupStaleGoroutineLoggers() {
    staleThreshold := 30 * time.Minute // Clean up entries older than 30 minutes
    now := time.Now()
    cleaned := 0
    totalEntries := 0
    goroutineTimestamps.Range(func(key, value interface{}) bool {
        totalEntries++
        gid := key.(uint64)
        timestamp := value.(time.Time)
        if now.Sub(timestamp) > staleThreshold {
            goroutineLoggers.Delete(gid)
            goroutineTimestamps.Delete(gid)
            cleaned++
        }
        return true
    })
// Simple time-based cleanup (no race conditions)
func cleanupStaleGoroutineLoggers() {
    staleThreshold := 30 * time.Minute // Clean up entries older than 30 minutes
    now := time.Now()
    cleaned := 0
    totalEntries := 0
    // First collect all stale entries to avoid race conditions during cleanup
    var staleEntries []uint64
    goroutineTimestamps.Range(func(key, value interface{}) bool {
        totalEntries++
        gid := key.(uint64)
        timestamp := value.(time.Time)
        if now.Sub(timestamp) > staleThreshold {
            staleEntries = append(staleEntries, gid)
            cleaned++
        }
        return true
    })
    // Then delete them in a separate loop to avoid concurrent map iteration and deletion
    for _, gid := range staleEntries {
        goroutineLoggers.Delete(gid)
        goroutineTimestamps.Delete(gid)
    }
Comment on lines +181 to +183
// Simple time-based cleanup (no race conditions)
func cleanupStaleGoroutineLoggers() {
    staleThreshold := 30 * time.Minute // Clean up entries older than 30 minutes

The cleanupStaleGoroutineLoggers function uses a hard-coded 30-minute threshold to determine if a goroutine logger entry is stale. This is too long for a high-traffic system and could lead to memory bloat if many requests are processed.

I've added a new helper function getStaleThreshold() that makes the threshold configurable via the DIGGER_LOGGER_STALE_THRESHOLD_MINUTES environment variable, with a more reasonable default of 5 minutes. This allows operators to tune the system based on their traffic patterns without code changes.

The function includes proper error handling and debug logging for invalid configurations.

Suggested change
// Simple time-based cleanup (no race conditions)
func cleanupStaleGoroutineLoggers() {
    staleThreshold := 30 * time.Minute // Clean up entries older than 30 minutes
// getStaleThreshold returns the configured stale threshold from environment or default
func getStaleThreshold() time.Duration {
    if thresholdStr := os.Getenv("DIGGER_LOGGER_STALE_THRESHOLD_MINUTES"); thresholdStr != "" {
        if minutes, err := strconv.Atoi(thresholdStr); err == nil && minutes > 0 {
            return time.Duration(minutes) * time.Minute
        }
        // Log invalid value in debug mode
        if os.Getenv("DIGGER_LOG_LEVEL") == "DEBUG" {
            slog.Debug("Invalid DIGGER_LOGGER_STALE_THRESHOLD_MINUTES value, using default", "value", thresholdStr)
        }
    }
    return 5 * time.Minute // Default to 5 minutes
}
// Simple time-based cleanup (no race conditions)
func cleanupStaleGoroutineLoggers() {
    staleThreshold := getStaleThreshold() // Get configurable threshold with default of 5 minutes
Comment on lines +73 to +84
logLevel := os.Getenv("DIGGER_LOG_LEVEL")
var level slog.Leveler

if logLevel == "DEBUG" {
    level = slog.LevelDebug
} else if logLevel == "WARN" {
    level = slog.LevelWarn
} else if logLevel == "ERROR" {
    level = slog.LevelError
} else {
    level = slog.LevelInfo
}

The code in context_logger.go doesn't validate the log level value provided through the DIGGER_LOG_LEVEL environment variable. If a user provides an invalid value (anything other than "DEBUG", "WARN", "ERROR", or "INFO"), the code silently defaults to the INFO level without any warning. This could lead to confusion for users who might think their custom log level is being applied when it's actually being ignored.

The fix adds validation for the log level and outputs a warning message when an invalid value is provided. This helps users identify configuration issues early. The warning is written directly to stderr to avoid potential circular dependencies with the logging system itself, which hasn't been fully initialized at this point.

Suggested change
logLevel := os.Getenv("DIGGER_LOG_LEVEL")
var level slog.Leveler
if logLevel == "DEBUG" {
    level = slog.LevelDebug
} else if logLevel == "WARN" {
    level = slog.LevelWarn
} else if logLevel == "ERROR" {
    level = slog.LevelError
} else {
    level = slog.LevelInfo
}
logLevel := os.Getenv("DIGGER_LOG_LEVEL")
var level slog.Leveler
if logLevel == "DEBUG" {
    level = slog.LevelDebug
} else if logLevel == "WARN" {
    level = slog.LevelWarn
} else if logLevel == "ERROR" {
    level = slog.LevelError
} else if logLevel != "" && logLevel != "INFO" {
    // Log a warning for invalid log level values
    fmt.Fprintf(os.Stderr, "WARNING: Invalid log level '%s' provided. Using default INFO level instead.\n", logLevel)
    level = slog.LevelInfo
} else {
    level = slog.LevelInfo
}
Comment on lines +173 to +176
// Log error in debug mode
if os.Getenv("DIGGER_LOG_LEVEL") == "DEBUG" {
    slog.Debug("Invalid logger type in goroutine map", "type", fmt.Sprintf("%T", logger))
}

The code directly checks the DIGGER_LOG_LEVEL environment variable in multiple places after initialization. This can lead to inconsistent behavior if the environment variable changes during runtime, as some parts of the code will use the cached log level (set during Init()) while others will read the current environment variable value.

The specific issue is that the logging system initializes with the environment variable value at startup (lines 72-83), but then directly checks the environment variable again in several places (lines 173-175, 201-206, 234-236). If the environment variable changes during runtime (which is possible in some environments), the logging behavior would be inconsistent.

This fix replaces the direct environment variable check with a check against the already configured logger level, ensuring consistent behavior regardless of environment variable changes during runtime.

Suggested change
// Log error in debug mode
if os.Getenv("DIGGER_LOG_LEVEL") == "DEBUG" {
    slog.Debug("Invalid logger type in goroutine map", "type", fmt.Sprintf("%T", logger))
}
// Log error in debug mode using the cached log level
if baseLogger != nil && baseLogger.Enabled(context.Background(), slog.LevelDebug) {
    slog.Debug("Invalid logger type in goroutine map", "type", fmt.Sprintf("%T", logger))
}
Comment on lines +76 to +84
if logLevel == "DEBUG" {
level = slog.LevelDebug
} else if logLevel == "WARN" {
level = slog.LevelWarn
} else if logLevel == "ERROR" {
level = slog.LevelError
} else {
level = slog.LevelInfo
}

The current log level parsing in context_logger.go is case-sensitive and only accepts exact string matches for "DEBUG", "WARN", and "ERROR". This approach is inflexible and could lead to unexpected behavior when users provide log levels in different formats (e.g., lowercase "debug" or alternative spellings like "warning").

The improved implementation:

  1. Converts the input to uppercase for case-insensitive comparison
  2. Uses a switch statement for cleaner code and better maintainability
  3. Adds support for "WARNING" as an alternative to "WARN"
  4. Explicitly handles "INFO" case (though it was the default before)
  5. Maintains the same default behavior for unrecognized values

This change makes the logging system more user-friendly by accepting various common formats for log level specification while maintaining backward compatibility.

Suggested change:

// Convert to uppercase for case-insensitive comparison
logLevel = strings.ToUpper(logLevel)
switch logLevel {
case "DEBUG":
	level = slog.LevelDebug
case "WARN", "WARNING":
	level = slog.LevelWarn
case "ERROR":
	level = slog.LevelError
case "INFO":
	level = slog.LevelInfo
default:
	// Default to INFO if unrecognized
	level = slog.LevelInfo
}
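Taken together, this suggestion and the earlier validation suggestion amount to a single helper. A sketch of how they could be folded together — parseLogLevel is a hypothetical name, not code from the PR:

package logging

import (
	"fmt"
	"log/slog"
	"os"
	"strings"
)

// parseLogLevel (hypothetical) combines both review suggestions: it is
// case-insensitive, accepts WARNING as an alias for WARN, and warns on
// stderr for unrecognized values instead of silently defaulting to INFO.
func parseLogLevel(raw string) slog.Leveler {
	switch strings.ToUpper(strings.TrimSpace(raw)) {
	case "DEBUG":
		return slog.LevelDebug
	case "WARN", "WARNING":
		return slog.LevelWarn
	case "ERROR":
		return slog.LevelError
	case "INFO", "":
		return slog.LevelInfo
	default:
		fmt.Fprintf(os.Stderr, "WARNING: Invalid log level '%s' provided. Using default INFO level instead.\n", raw)
		return slog.LevelInfo
	}
}

Init() could then set level via parseLogLevel(os.Getenv("DIGGER_LOG_LEVEL")), and the three direct DIGGER_LOG_LEVEL checks flagged later in this thread could reuse the same helper for consistent behavior.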
@greptile-apps greptile-apps bot left a comment

Greptile Summary

This review covers only the changes made since the last review (commit d0e10b5), not the entire PR.

The most recent change adds proper context inheritance to a goroutine in the handlePushEvent function in backend/controllers/github.go. Specifically, the goroutine that processes cache requests now:

  1. Accepts the context as a parameter: go func(ctx context.Context)
  2. Inherits the request logger context using defer logging.InheritRequestLogger(ctx)()
  3. Properly cleans up the logger when the goroutine completes

This change ensures that the cache processing goroutine maintains the same logging context as the parent request, allowing for proper request tracing and correlation across asynchronous operations. The modification aligns with the established pattern used by other goroutines in the same file and is part of the broader logging improvement effort indicated by the PR title "logging extended".

The InheritRequestLogger function creates a request-scoped logger mapping in a goroutine-specific map and returns a cleanup function that removes the mapping when the goroutine terminates, preventing memory leaks and ensuring consistent log correlation across the application.
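Based only on that description (the real implementation lives in backend/logging/context_logger.go and is not quoted in this thread), a minimal sketch of the mechanism — FromContext is an assumed accessor for the logger that Inject stores, and the other helpers are the ones shown elsewhere in this review:

// Sketch, not the PR's exact code.
// InheritRequestLogger copies the request-scoped logger carried in ctx into
// the goroutine-logger map for the current goroutine, and returns a cleanup
// func that removes the entry — hence the defer logging.InheritRequestLogger(ctx)() pattern.
func InheritRequestLogger(ctx context.Context) func() {
	logger := FromContext(ctx) // assumption: counterpart to Inject
	if logger == nil {
		return func() {} // nothing to inherit; cleanup is a no-op
	}
	gid := GetGoroutineID()
	if gid == 0 {
		return func() {} // could not identify this goroutine
	}
	StoreGoroutineLogger(gid, logger)
	return func() { DeleteGoroutineLogger(gid) }
}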

Confidence score: 5/5

  • This PR is safe to merge with minimal risk
  • Score reflects a simple, well-established pattern that improves logging consistency without changing core logic
  • No files require special attention

1 file reviewed, no comments


Comment on lines +173 to +176
// Log error in debug mode
if os.Getenv("DIGGER_LOG_LEVEL") == "DEBUG" {
	slog.Debug("Invalid logger type in goroutine map", "type", fmt.Sprintf("%T", logger))
}

The code directly checks for the exact string "DEBUG" when determining whether to log debug information. This is inconsistent with the improved log level parsing approach and would cause debug logs to be missed if users specify "debug" (lowercase) in the environment variable.

The fix applies the same case-insensitive comparison approach by converting the environment variable value to uppercase before comparison. This ensures that debug logs will be properly emitted regardless of the case used in the environment variable (e.g., "debug", "Debug", or "DEBUG").

This change should be applied to all similar direct string comparisons in the codebase for consistency.

Suggested change:

// Log error in debug mode
if strings.ToUpper(os.Getenv("DIGGER_LOG_LEVEL")) == "DEBUG" {
	slog.Debug("Invalid logger type in goroutine map", "type", fmt.Sprintf("%T", logger))
}
Comment on lines +201 to +207
// Always log cleanup activity in debug mode
if os.Getenv("DIGGER_LOG_LEVEL") == "DEBUG" {
	slog.Debug("Goroutine logger cleanup completed",
		"cleaned", cleaned,
		"remaining", totalEntries-cleaned,
		"threshold_minutes", int(staleThreshold.Minutes()))
}

Another instance of case-sensitive log level checking in the cleanup function. The code directly compares the environment variable value to the exact string "DEBUG", which would fail to log debug information if users specify the log level in a different case format (e.g., "debug").

The fix applies the same case-insensitive comparison by converting the environment variable value to uppercase before comparison, ensuring consistent behavior regardless of the case used in the environment variable.

Suggested change:

// Always log cleanup activity in debug mode
if strings.ToUpper(os.Getenv("DIGGER_LOG_LEVEL")) == "DEBUG" {
	slog.Debug("Goroutine logger cleanup completed",
		"cleaned", cleaned,
		"remaining", totalEntries-cleaned,
		"threshold_minutes", int(staleThreshold.Minutes()))
}
Comment on lines +234 to +237
// Log parsing error in debug mode
if os.Getenv("DIGGER_LOG_LEVEL") == "DEBUG" {
	slog.Debug("Failed to parse goroutine ID", "stack", stack[:idx])
}

This is the third instance of case-sensitive log level checking in the GetGoroutineID function. The code directly compares the environment variable value to the exact string "DEBUG", which would fail to log debug information if users specify the log level in a different case format.

The fix applies the same case-insensitive comparison by converting the environment variable value to uppercase before comparison, ensuring consistent behavior regardless of the case used in the environment variable.

Suggested change:

// Log parsing error in debug mode
if strings.ToUpper(os.Getenv("DIGGER_LOG_LEVEL")) == "DEBUG" {
	slog.Debug("Failed to parse goroutine ID", "stack", stack[:idx])
}
Comment on lines +11 to +54
func Middleware() gin.HandlerFunc {
	return func(c *gin.Context) {
		rid := c.GetHeader("X-Request-ID")
		if rid == "" {
			rid = uuid.NewString()
			c.Writer.Header().Set("X-Request-ID", rid)
		}

		// Protect against nil base logger
		baseLogger := GetBaseLogger()
		if baseLogger == nil {
			// Fallback to default if initialization failed
			baseLogger = slog.Default()
		}

		reqLog := baseLogger.With(
			slog.String("request_id", rid),
			slog.String("method", c.Request.Method),
			slog.String("route", c.FullPath()),
			slog.String("path", c.Request.URL.Path),
			slog.String("ip", c.ClientIP()),
		)

		// Store in context
		ctx := Inject(c.Request.Context(), reqLog)
		c.Request = c.Request.WithContext(ctx)

		// Store in goroutine map with error protection
		gid := GetGoroutineID()
		if gid != 0 { // Only store if we got a valid ID
			StoreGoroutineLogger(gid, reqLog)
			defer DeleteGoroutineLogger(gid) // Ensure cleanup
		}

		start := time.Now()
		c.Next()

		reqLog.Info("http_request_done",
			slog.Int("status", c.Writer.Status()),
			slog.Duration("latency", time.Since(start)),
			slog.Int("bytes_out", c.Writer.Size()),
		)
	}
}

The logging middleware in backend/logging/middleware.go doesn't handle panics that might occur during request processing. This is a bug because when a panic occurs in a handler, the middleware won't be able to log the error before the panic propagates up the chain.

I've added a recovery mechanism that:

  1. Catches any panics that occur during request processing
  2. Logs the panic with detailed information including the stack trace
  3. Re-panics to allow the error to propagate up the middleware chain

This change ensures that all panics are properly logged with the request context before they're handled by other recovery middleware (like gin's built-in recovery). The modification includes importing the "runtime" package which is needed to capture the stack trace.

This is important for debugging and monitoring, as it ensures that even catastrophic errors are properly logged with their associated request context before the application potentially crashes or the error is handled by a higher-level recovery mechanism.
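The suggested code itself does not appear in the thread; a sketch of the recovery block being described, as it might sit just before c.Next() in the handler above (placement and log field names are assumptions):

		// Sketch of the described fix, not the PR's code: log panics with the
		// request-scoped logger, then re-panic so gin's recovery still runs.
		defer func() {
			if rec := recover(); rec != nil {
				stack := make([]byte, 4096)
				stack = stack[:runtime.Stack(stack, false)]
				reqLog.Error("http_request_panic",
					slog.Any("panic", rec),
					slog.String("stack", string(stack)),
				)
				panic(rec) // propagate to the next recovery middleware
			}
		}()

Registering the deferred recover before c.Next() means it fires while the request-scoped logger and goroutine mapping are still in place, which is what gives the panic log its request_id correlation.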

Comment on lines +65 to +70
cleanupTicker = time.NewTicker(10 * time.Minute)
go func() {
	for range cleanupTicker.C {
		cleanupStaleGoroutineLoggers()
	}
}()

The cleanup ticker for goroutine loggers is started but never stopped, so a goroutine is leaked each time the logging system is re-initialized within the same process. While the ticker itself is stored in a package-level variable, there is no mechanism to stop it when the application shuts down.

This could be fixed by adding a shutdown function that stops the ticker and is called during application shutdown; the current code only creates the ticker and provides no way to clean it up.

Suggested change:

// Create the cleanup ticker and store it for potential shutdown
cleanupTicker = time.NewTicker(10 * time.Minute)
go func() {
	for range cleanupTicker.C {
		cleanupStaleGoroutineLoggers()
	}
}()
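The suggestion above only adds a comment; the shutdown mechanism the reviewer describes would also need a way to end the loop. A sketch under assumed names (startCleanup, cleanupDone, and StopCleanup are not in the PR) — note that Ticker.Stop alone does not close the channel, so a done channel is what actually lets the goroutine exit:

var (
	cleanupTicker *time.Ticker
	cleanupDone   = make(chan struct{})
)

// startCleanup launches the periodic cleanup loop with a stoppable select.
func startCleanup() {
	cleanupTicker = time.NewTicker(10 * time.Minute)
	go func() {
		for {
			select {
			case <-cleanupTicker.C:
				cleanupStaleGoroutineLoggers()
			case <-cleanupDone:
				return
			}
		}
	}()
}

// StopCleanup stops the ticker and signals the goroutine to return; it should
// be called once during application shutdown.
func StopCleanup() {
	cleanupTicker.Stop()
	close(cleanupDone)
}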
Comment on lines +222 to +240
func GetGoroutineID() uint64 {
	buf := make([]byte, goroutineStackBufferSize)
	buf = buf[:runtime.Stack(buf, false)]

	// Stack trace format: "goroutine 123 [running]:"
	stack := string(buf)
	if strings.HasPrefix(stack, "goroutine ") {
		stack = stack[len("goroutine "):]
		if idx := strings.Index(stack, " "); idx > 0 {
			if id, err := strconv.ParseUint(stack[:idx], 10, 64); err == nil {
				return id
			}
			// Log parsing error in debug mode
			if os.Getenv("DIGGER_LOG_LEVEL") == "DEBUG" {
				slog.Debug("Failed to parse goroutine ID", "stack", stack[:idx])
			}
		}
	}
	return 0 // fallback

The code uses runtime.Stack to extract goroutine IDs, which is not an officially supported way to get goroutine IDs. This approach is brittle and could break in future Go versions as the stack trace format is not guaranteed to remain stable.

While there's currently no official Go API to get goroutine IDs, the code should at least document this limitation and mark it as a potential area for improvement when Go provides an official API. The added comment makes it clear that this is a workaround and should be revisited in the future.

Suggested change:

// GetGoroutineID extracts the goroutine ID from the stack trace.
// Note: This is not an officially supported way to get goroutine IDs and may break in future Go versions.
// TODO: Consider using a more reliable approach when Go provides an official API for this.
func GetGoroutineID() uint64 {
	buf := make([]byte, goroutineStackBufferSize)
	buf = buf[:runtime.Stack(buf, false)]

	// Stack trace format: "goroutine 123 [running]:"
	stack := string(buf)
	if strings.HasPrefix(stack, "goroutine ") {
		stack = stack[len("goroutine "):]
		if idx := strings.Index(stack, " "); idx > 0 {
			if id, err := strconv.ParseUint(stack[:idx], 10, 64); err == nil {
				return id
			}
			// Log parsing error in debug mode
			if os.Getenv("DIGGER_LOG_LEVEL") == "DEBUG" {
				slog.Debug("Failed to parse goroutine ID", "stack", stack[:idx])
			}
		}
	}
	return 0 // fallback
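As a quick sanity check of the stack-parsing approach, a test along these lines (hypothetical, not part of the PR) confirms each goroutine sees its own non-zero ID:

package logging

import "testing"

// Hypothetical test: the parser should return a non-zero ID, and a spawned
// goroutine should report a different one than the test's main goroutine.
func TestGetGoroutineID(t *testing.T) {
	main := GetGoroutineID()
	if main == 0 {
		t.Fatal("expected a non-zero goroutine ID")
	}
	other := make(chan uint64, 1)
	go func() { other <- GetGoroutineID() }()
	if got := <-other; got == 0 || got == main {
		t.Fatalf("expected a distinct non-zero ID, got %d (main %d)", got, main)
	}
}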
@motatoes motatoes merged commit 92358f1 into develop Aug 22, 2025
12 checks passed
breardon2011 added a commit that referenced this pull request Sep 17, 2025