Log Management and Analytics

This document describes the log management and analytics subsystem of Nginx UI, which provides automated log file discovery, high-performance indexing, full-text search, and analytics capabilities for Nginx access and error logs.

Scope: This page covers the overall architecture, service initialization, log file management, and data flow through the log processing pipeline. For detailed information about the indexing internals, see Log Indexing System. For search and analytics features, see Search and Analytics. For real-time communication, see WebSocket Services.

System Architecture

The log management system consists of modular services that handle log discovery, indexing, searching, and analytics. The services are global singleton instances, initialized through InitializeServices() in internal/nginx_log/modern_services.go.

Core Architecture with Code Entities

Key Design Patterns:

  • Singleton Services: Global instances (globalIndexer, globalSearcher, etc.) initialized once via InitializeServices()
  • Worker Pool: ParallelIndexer maintains workers []*indexWorker that process jobs from jobQueue
  • Grouped Sharding: GroupedShardManager creates one Bleve index per log group using main_log_path field
  • Zero-Downtime Updates: SwapShards() uses bleve.IndexAlias for atomic shard replacement
  • Event-Driven UI: WebSocket events (TypeNginxLogIndexProgress, TypeNginxLogIndexReady) provide real-time updates

Service Initialization

The log management system initializes conditionally based on settings.NginxLogSettings.IndexingEnabled. When enabled, it creates and starts all modular services with proper lifecycle management.

Service Initialization Flow

Initialization Sequence

The InitializeServices() function (internal/nginx_log/modern_services.go44-87) performs these steps:

  1. Enable Check: Exits early if settings.NginxLogSettings.IndexingEnabled is false
  2. Duplicate Check: Returns if servicesInitialized is already true
  3. Context Creation: Creates cancellable context via context.WithCancel(ctx)
  4. Component Initialization: Calls initializeWithDefaults(serviceCtx) (internal/nginx_log/modern_services.go90-129):
    initializeWithDefaults()
    ├─> indexer.InitLogParser()                         // Global parser singleton
    ├─> searcher.NewSearcher(config, []bleve.Index{})   // Empty searcher
    ├─> analytics.NewService(globalSearcher)            // Analytics service
    ├─> indexer.NewGroupedShardManager(config)          // Shard manager
    ├─> indexer.NewParallelIndexer(config, manager)     // Indexer with workers
    ├─> globalIndexer.Start(ctx)                        // Start worker pool
    ├─> indexer.NewLogFileManager()                     // Log file tracker
    ├─> manager.SetIndexer(globalIndexer)               // Inject indexer reference
    └─> updateSearcherShardsLocked()                    // Load existing shards
  5. Task Scheduling: Starts InitTaskScheduler(serviceCtx) in goroutine for background job processing
  6. Shutdown Monitoring: Goroutine monitors serviceCtx.Done() and calls StopServices() for cleanup
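
The sequence above can be sketched in a few lines of Go. This is a minimal illustration and not the project's code: serviceState, initialize, stop, and lifecycleDemo are hypothetical names, and component construction (step 4) is reduced to a comment. It only shows the guard order, the cancellable context, and the goroutine that cleans up once that context is done.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// serviceState is a hypothetical stand-in for the package-level globals
// (the initialized flag, the cancel function, and so on).
type serviceState struct {
	mu          sync.Mutex
	initialized bool
	cancel      context.CancelFunc
	stopped     chan struct{}
}

// initialize follows the documented order: enable check, duplicate check,
// context creation, component setup, shutdown monitoring.
// It reports whether this call actually started the services.
func (s *serviceState) initialize(ctx context.Context, indexingEnabled bool) bool {
	if !indexingEnabled {
		return false // step 1: indexing disabled, nothing to start
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.initialized {
		return false // step 2: already initialized
	}
	serviceCtx, cancel := context.WithCancel(ctx) // step 3
	s.cancel = cancel
	s.stopped = make(chan struct{})
	// Step 4 would construct the parser, searcher, indexer, etc. here.
	s.initialized = true
	// Step 6: clean up once the service context is cancelled.
	go func() {
		<-serviceCtx.Done()
		s.stop()
	}()
	return true
}

// stop is idempotent: only the first call after initialize closes stopped.
func (s *serviceState) stop() {
	s.mu.Lock()
	defer s.mu.Unlock()
	if !s.initialized {
		return
	}
	s.initialized = false
	close(s.stopped)
}

// lifecycleDemo exercises each guard once and waits for the cleanup goroutine.
func lifecycleDemo() (disabled, first, second, stopped bool) {
	s := &serviceState{}
	disabled = s.initialize(context.Background(), false)
	ctx, cancel := context.WithCancel(context.Background())
	first = s.initialize(ctx, true)
	second = s.initialize(ctx, true)
	cancel()
	<-s.stopped // closed by the monitoring goroutine
	stopped = true
	return
}

func main() {
	fmt.Println(lifecycleDemo())
}
```

Cancelling the parent context is enough to trigger cleanup, because the service context is derived from it.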

Index Path Configuration

The index storage location is determined by getConfigDirIndexPath() (internal/nginx_log/modern_services.go132-162):

  1. Custom Path: Uses settings.NginxLogSettings.IndexPath if configured
  2. Config Directory: Uses <config_dir>/log-index where <config_dir> is from cSettings.ConfPath
  3. Fallback: Uses ./log-index in current working directory
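
A three-tier lookup like this might be written as follows. The sketch is illustrative only: resolveIndexPath is a hypothetical name, customPath stands in for settings.NginxLogSettings.IndexPath, and confPath for cSettings.ConfPath. It also assumes that ConfPath names the configuration file, so that its parent directory is <config_dir>.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// resolveIndexPath returns the first applicable location:
// an explicit custom path, then <config_dir>/log-index, then ./log-index.
// Assumption: confPath points at the config file, not at its directory.
func resolveIndexPath(customPath, confPath string) string {
	if customPath != "" {
		return customPath
	}
	if confPath != "" {
		return filepath.Join(filepath.Dir(confPath), "log-index")
	}
	return "./log-index"
}

func main() {
	fmt.Println(resolveIndexPath("/data/idx", "/etc/nginx-ui/app.ini"))
	fmt.Println(resolveIndexPath("", "/etc/nginx-ui/app.ini"))
	fmt.Println(resolveIndexPath("", ""))
}
```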

Fallback Storage When Indexing Disabled

When IndexingEnabled is false, no LogFileManager is created, but log discovery still works using in-memory storage.

The legacy API functions (internal/nginx_log/modern_services.go256-317) check GetLogFileManager() == nil and fall back to fallbackCache for operations like AddLogPath(), GetAllLogPaths(), and RemoveLogPathsFromConfig().

Log File Discovery and Management

The LogFileManager discovers and tracks Nginx log files through integration with the configuration scanner:

NginxLogCache Structure

The NginxLogCache struct (internal/nginx_log/indexer/log_file_manager.go20-26) represents a discovered log file from nginx configuration:

This is distinct from NginxLogWithIndex (internal/nginx_log/indexer/log_file_manager.go28-50), which includes indexing metadata:

| Field | Type | Description |
| --- | --- | --- |
| Path | string | Absolute path to log file |
| Type | string | "access" or "error" |
| Name | string | Display name |
| ConfigFile | string | Source nginx config file |
| IndexStatus | string | "indexed", "indexing", "queued", "not_indexed", "error" |
| LastModified | int64 | Unix timestamp of file modification |
| LastSize | int64 | File size in bytes |
| LastIndexed | int64 | Unix timestamp of last successful index |
| IndexStartTime | int64 | Unix timestamp when indexing started |
| IndexDuration | int64 | Duration in milliseconds |
| DocumentCount | uint64 | Number of indexed documents |
| TimeRangeStart | int64 | Earliest log entry timestamp |
| TimeRangeEnd | int64 | Latest log entry timestamp |
| ErrorMessage | string | Error details if status is "error" |
| QueuePosition | int | Position in indexing queue |
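
As a rough picture, the fields listed above map onto a Go struct like the one below. Field names and types follow the table; struct tags are omitted because the source does not show them. The isUpToDate helper is a hypothetical addition that illustrates how LastSize and LastModified could be compared against the file on disk.

```go
package main

import "fmt"

// NginxLogWithIndex mirrors the documented fields; tags are not shown in the source.
type NginxLogWithIndex struct {
	Path           string
	Type           string // "access" or "error"
	Name           string
	ConfigFile     string
	IndexStatus    string
	LastModified   int64 // Unix timestamp
	LastSize       int64 // bytes
	LastIndexed    int64 // Unix timestamp
	IndexStartTime int64 // Unix timestamp
	IndexDuration  int64 // milliseconds
	DocumentCount  uint64
	TimeRangeStart int64
	TimeRangeEnd   int64
	ErrorMessage   string
	QueuePosition  int
}

// isUpToDate is a hypothetical helper: a file counts as current only if it
// is marked "indexed" and neither its size nor its modification time changed.
func isUpToDate(l NginxLogWithIndex, size, modTime int64) bool {
	return l.IndexStatus == "indexed" && size == l.LastSize && modTime <= l.LastModified
}

func main() {
	l := NginxLogWithIndex{
		Path:         "/var/log/nginx/access.log",
		Type:         "access",
		IndexStatus:  "indexed",
		LastSize:     2048,
		LastModified: 1700000000,
	}
	fmt.Println(isUpToDate(l, 2048, 1700000000)) // unchanged file
	fmt.Println(isUpToDate(l, 4096, 1700000500)) // file has grown since indexing
}
```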

Log File Status Lifecycle

Each log file progresses through distinct indexing states tracked in the model.NginxLogIndex database table:

Log File Status State Machine

IndexStatus Enum Values

The IndexStatus type is defined in internal/nginx_log/indexer/types.go13-22:

| Status | Value | Description |
| --- | --- | --- |
| IndexStatusNotIndexed | "not_indexed" | File discovered but not yet processed |
| IndexStatusQueued | "queued" | Waiting in the indexer's job queue |
| IndexStatusIndexing | "indexing" | Currently being processed by a worker |
| IndexStatusIndexed | "indexed" | Successfully indexed with complete metadata |
| IndexStatusError | "error" | Failed with error message stored |
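
Expressed in Go, the enum values above might look like the following sketch. It assumes IndexStatus is a string-based type, which the stored values suggest but the source does not show. parseIndexStatus is a hypothetical helper for validating a value read back from storage.

```go
package main

import "fmt"

// IndexStatus is assumed here to be a string-based type.
type IndexStatus string

const (
	IndexStatusNotIndexed IndexStatus = "not_indexed"
	IndexStatusQueued     IndexStatus = "queued"
	IndexStatusIndexing   IndexStatus = "indexing"
	IndexStatusIndexed    IndexStatus = "indexed"
	IndexStatusError      IndexStatus = "error"
)

// parseIndexStatus (hypothetical) accepts only the five documented values.
func parseIndexStatus(s string) (IndexStatus, bool) {
	switch st := IndexStatus(s); st {
	case IndexStatusNotIndexed, IndexStatusQueued, IndexStatusIndexing,
		IndexStatusIndexed, IndexStatusError:
		return st, true
	}
	return "", false
}

func main() {
	fmt.Println(parseIndexStatus("queued"))
	fmt.Println(parseIndexStatus("paused"))
}
```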

Status Transitions

The PersistenceManager updates status through these key methods:

Log Grouping and Rotation Handling

The LogFileManager.GetAllLogsWithIndexGrouped() method aggregates rotated files by their base log path (internal/nginx_log/indexer/log_file_manager.go187-426):

  • Detects rotation patterns: access.log.1, access.log.2.gz, access.log.20240101
  • Groups files by getBaseLogName() which strips rotation suffixes
  • Aggregates DocumentCount and time ranges across all files in the group
  • Returns a single entry per log group with combined statistics
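
The grouping idea can be illustrated with a short sketch. It is not the project's getBaseLogName: baseLogName, fileStats, and groupLogs are hypothetical names, and the suffix rules cover only the three rotation patterns listed above (a numeric suffix, optionally followed by .gz).

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	gzSuffix       = regexp.MustCompile(`\.gz$`)
	rotationSuffix = regexp.MustCompile(`\.\d+$`) // .1, .2, .20240101
)

// baseLogName strips an optional .gz and then a trailing numeric suffix.
func baseLogName(path string) string {
	return rotationSuffix.ReplaceAllString(gzSuffix.ReplaceAllString(path, ""), "")
}

// fileStats holds the per-file values that are aggregated per group.
type fileStats struct {
	Path           string
	DocumentCount  uint64
	TimeRangeStart int64
	TimeRangeEnd   int64
}

// groupLogs sums document counts and widens the time range per base log path.
func groupLogs(files []fileStats) map[string]fileStats {
	groups := make(map[string]fileStats)
	for _, f := range files {
		base := baseLogName(f.Path)
		g, ok := groups[base]
		if !ok {
			g = fileStats{Path: base, TimeRangeStart: f.TimeRangeStart, TimeRangeEnd: f.TimeRangeEnd}
		}
		g.DocumentCount += f.DocumentCount
		if f.TimeRangeStart < g.TimeRangeStart {
			g.TimeRangeStart = f.TimeRangeStart
		}
		if f.TimeRangeEnd > g.TimeRangeEnd {
			g.TimeRangeEnd = f.TimeRangeEnd
		}
		groups[base] = g
	}
	return groups
}

func main() {
	groups := groupLogs([]fileStats{
		{"/var/log/nginx/access.log", 100, 300, 400},
		{"/var/log/nginx/access.log.1", 50, 200, 300},
		{"/var/log/nginx/access.log.2.gz", 25, 100, 200},
	})
	fmt.Printf("%+v\n", groups["/var/log/nginx/access.log"])
}
```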

Indexing Pipeline Architecture

The indexing pipeline uses a parallel worker pool with grouped sharding and SIMD-optimized parsing. It processes log groups (base log files and their rotated variants) through four stages: file discovery, parsing, document routing, and Bleve indexing. Implementation details are covered on pages 8.1 (Parallel Indexing Pipeline) and 8.2 (Search and Query System).

Indexing Pipeline Data Flow

Key Components

ParallelIndexer Structure (internal/nginx_log/indexer/parallel_indexer.go18-52)

The ParallelIndexer struct maintains the core indexing state:

Worker Pool Processing (internal/nginx_log/indexer/parallel_indexer.go940-977)

Each indexWorker runs in its own goroutine:

  1. Blocks on jobQueue channel
  2. Updates status to WorkerStatusBusy
  3. Calls processJob() to:
    • Group documents by mainLogPath then by shardID
    • Call GetShardForDocument() to route to correct shard
    • Index documents via indexShardDocuments()
  4. Sends IndexResult to resultQueue
  5. Executes job callback with error status
  6. Updates status back to WorkerStatusIdle
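
The worker loop described above follows the usual Go channel-based worker pool shape. The sketch below is a simplified illustration under stated assumptions: job, result, and runWorkers are hypothetical names, the "indexing" work is reduced to counting lines, and the status updates and job callbacks from steps 2, 5, and 6 are omitted.

```go
package main

import (
	"fmt"
	"sync"
)

// job and result are simplified stand-ins for the indexer's job and IndexResult types.
type job struct {
	ID    int
	Lines []string
}

type result struct {
	JobID     int
	Processed int
}

// runWorkers starts n workers that block on jobQueue and report to resultQueue.
func runWorkers(n int, jobs []job) []result {
	jobQueue := make(chan job)
	resultQueue := make(chan result, len(jobs)) // buffered so workers never block on send
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobQueue { // blocks until a job arrives or the queue closes
				resultQueue <- result{JobID: j.ID, Processed: len(j.Lines)}
			}
		}()
	}
	for _, j := range jobs {
		jobQueue <- j
	}
	close(jobQueue) // lets the workers' range loops end
	wg.Wait()
	close(resultQueue)

	var out []result
	for r := range resultQueue {
		out = append(out, r)
	}
	return out
}

func main() {
	jobs := []job{
		{ID: 1, Lines: []string{"a", "b"}},
		{ID: 2, Lines: []string{"c"}},
		{ID: 3, Lines: []string{"d", "e", "f"}},
	}
	for _, r := range runWorkers(2, jobs) {
		fmt.Printf("job %d processed %d lines\n", r.JobID, r.Processed)
	}
}
```

Results arrive in completion order, not submission order, so callers that care about ordering need to key on the job ID.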

SIMD-Optimized Parser (internal/nginx_log/indexer/parser.go59-76)

The global logParser singleton uses optimized parsing:

  • Initialized once via InitLogParser() (internal/nginx_log/indexer/parser.go24-52)
  • Uses parser.StreamParse() for 7-8x faster batch processing
  • Achieves 70% memory reduction through zero-allocation techniques
  • Processes logs in chunks with configurable BatchSize (default 15000)

Grouped Shard Manager (internal/nginx_log/indexer/parallel_indexer.go104-108)

Routes documents by mainLogPath instead of individual file paths:

  • Creates one shard per log group (e.g., /var/log/nginx/access.log and all its rotated files share a shard)
  • Implements GetShardForDocument(mainLogPath, docID) method
  • Maintains shardsByMainLogPath map[string]bleve.Index
  • Enables efficient queries across all rotated files in a log group
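
A stripped-down version of this routing is sketched below. It is an assumption-laden illustration: shard is a stand-in for bleve.Index, groupedShards and getShardForDocument are hypothetical names, and the real GetShardForDocument signature and error handling are not shown in the source. The point is only that every document carrying the same mainLogPath lands in the same shard.

```go
package main

import (
	"fmt"
	"sync"
)

// shard stands in for a bleve.Index; here it only stores document IDs.
type shard struct {
	docs []string
}

// groupedShards keeps one shard per log group, keyed by main log path.
type groupedShards struct {
	mu                  sync.Mutex
	shardsByMainLogPath map[string]*shard
}

func newGroupedShards() *groupedShards {
	return &groupedShards{shardsByMainLogPath: make(map[string]*shard)}
}

// getShardForDocument returns the group's shard, creating it on first use.
// docID is accepted to mirror the documented method but is not needed for routing here.
func (g *groupedShards) getShardForDocument(mainLogPath, docID string) *shard {
	g.mu.Lock()
	defer g.mu.Unlock()
	s, ok := g.shardsByMainLogPath[mainLogPath]
	if !ok {
		s = &shard{}
		g.shardsByMainLogPath[mainLogPath] = s
	}
	return s
}

func main() {
	g := newGroupedShards()
	// Documents from access.log and access.log.1 share the same main log path.
	a := g.getShardForDocument("/var/log/nginx/access.log", "doc-1")
	b := g.getShardForDocument("/var/log/nginx/access.log", "doc-2")
	c := g.getShardForDocument("/var/log/nginx/error.log", "doc-3")
	fmt.Println(a == b, a == c, len(g.shardsByMainLogPath))
}
```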

Search and Analytics Architecture

The search and analytics layer provides full-text search, faceted filtering, and aggregated statistics with HyperLogLog cardinality counting for accurate UV metrics. Detailed coverage is in Search and Analytics.

Search and Analytics Component Architecture

Search Request Structure

The SearchRequest struct (internal/nginx_log/searcher/types.go48-95) specifies search parameters:

| Field | Type | Description |
| --- | --- | --- |
| Query | string | Full-text search query string |
| Fields | []string | Fields to retrieve in results |
| LogPaths | []string | Filter by specific log file paths |
| UseMainLogPath | bool | Use main_log_path field for log group queries |
| StartTime | *int64 | Unix timestamp start (inclusive) |
| EndTime | *int64 | Unix timestamp end (inclusive) |
| IPAddresses | []string | Filter by IP addresses |
| Methods | []string | Filter by HTTP methods |
| StatusCodes | []int | Filter by HTTP status codes |
| Paths | []string | Filter by request paths |
| UserAgents | []string | Filter by user agent strings |
| Referers | []string | Filter by referer headers |
| Countries | []string | Filter by country codes |
| Browsers | []string | Filter by browser names |
| OSs | []string | Filter by operating systems |
| Devices | []string | Filter by device types |
| MinBytes | *int64 | Minimum bytes sent |
| MaxBytes | *int64 | Maximum bytes sent |
| MinReqTime | *float64 | Minimum request time |
| MaxReqTime | *float64 | Maximum request time |
| Limit | int | Page size |
| Offset | int | Page offset |
| SortBy | string | Field to sort by |
| SortOrder | string | "asc" or "desc" |
| IncludeHighlighting | bool | Enable search term highlighting |
| IncludeFacets | bool | Request faceted aggregations |
| FacetFields | []string | Fields to facet on |
| FacetSize | int | Number of terms per facet (default 10) |
| IncludeStats | bool | Include statistical aggregations |
| Timeout | time.Duration | Request timeout |
| UseCache | bool | Enable result caching |
| CacheKey | string | Custom cache key |
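
To show how the optional pointer fields and the paging fields might combine, here is a small in-memory sketch. It uses a hypothetical searchRequest type with only five of the fields above, and it filters a slice instead of querying Bleve. Two assumptions are made that the source does not state: a nil StartTime or EndTime means "no bound", and an empty StatusCodes slice means "no status filter".

```go
package main

import "fmt"

// searchRequest carries a small subset of the documented SearchRequest fields.
type searchRequest struct {
	StartTime   *int64 // inclusive; nil is assumed to mean unbounded
	EndTime     *int64 // inclusive; nil is assumed to mean unbounded
	StatusCodes []int  // empty is assumed to mean no filter
	Limit       int
	Offset      int
}

// entry is a hypothetical parsed log record.
type entry struct {
	Timestamp int64
	Status    int
}

// matches applies the time range and status filters to one entry.
func (r searchRequest) matches(e entry) bool {
	if r.StartTime != nil && e.Timestamp < *r.StartTime {
		return false
	}
	if r.EndTime != nil && e.Timestamp > *r.EndTime {
		return false
	}
	if len(r.StatusCodes) == 0 {
		return true
	}
	for _, code := range r.StatusCodes {
		if e.Status == code {
			return true
		}
	}
	return false
}

// apply filters the entries, then applies Offset and Limit as paging.
func (r searchRequest) apply(entries []entry) []entry {
	var hits []entry
	for _, e := range entries {
		if r.matches(e) {
			hits = append(hits, e)
		}
	}
	if r.Offset > 0 {
		if r.Offset >= len(hits) {
			return nil
		}
		hits = hits[r.Offset:]
	}
	if r.Limit > 0 && r.Limit < len(hits) {
		hits = hits[:r.Limit]
	}
	return hits
}

func main() {
	start, end := int64(100), int64(200)
	req := searchRequest{StartTime: &start, EndTime: &end, StatusCodes: []int{404}, Limit: 10}
	entries := []entry{{99, 404}, {100, 404}, {150, 200}, {200, 404}, {201, 404}}
	fmt.Println(req.apply(entries))
}
```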

Analytics Methods and Metrics

The analytics.Service interface (internal/nginx_log/analytics/service.go12-31) provides these analysis methods:

| Method | Purpose | Return Type |
| --- | --- | --- |
| GetDashboardAnalytics(ctx, req) | Comprehensive dashboard stats | *DashboardAnalytics |
| GetLogEntriesStats(ctx, req) | General log statistics | *EntriesStats |
| GetGeoDistribution(ctx, req) | Geographic access patterns | *GeoDistribution |
| GetTopPaths(ctx, req) | Most accessed URLs | []KeyValue |
| GetTopIPs(ctx, req) | Most active IP addresses | []KeyValue |
| GetTopUserAgents(ctx, req) | Most common user agents | []KeyValue |
| GetTopCountries(ctx, req) | Most active countries | []CountryStats |
| GetTopCities(ctx, req) | Most active cities | []CityStats |

Cardinality Counting with HyperLogLog

The analytics service uses HyperLogLog for efficient unique value counting (internal/nginx_log/analytics/service.go66-85):

Accuracy Comparison:

| Metric | Facet-Based | HyperLogLog-Based | Use Case |
| --- | --- | --- | --- |
| UV (Unique IPs) | Limited by FacetSize | ~2% error rate | Used when facets would undercount |
| PV (Total Hits) | Exact via TotalHits | N/A | Direct from search results |
| Unique Pages | Limited by FacetSize | ~2% error rate | URL diversity metrics |
| Browser/OS Stats | Exact via facets | N/A | Facets sufficient for categories |
| Geo Distribution | Exact via facets | N/A | Facets sufficient for regions |

The Counter.CountCardinality() method keeps UV estimates within HyperLogLog's roughly 2% error even when the number of unique IPs exceeds the facet size limit (typically 50-1000), where a facet-based count would be capped at that limit.
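
For readers unfamiliar with the technique, the sketch below is a compact, self-contained HyperLogLog in Go. It is not the project's Counter implementation: the hll type, the register count (4096, giving a theoretical standard error of 1.04/sqrt(4096), about 1.6%), and the hash (FNV-1a followed by the MurmurHash3 64-bit finalizer) are all choices made for illustration. demoEstimate is a hypothetical helper that feeds synthetic IP strings, each one twice, to show that duplicates do not inflate the count.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math"
	"math/bits"
)

const (
	precision     = 12             // index bits
	registerCount = 1 << precision // 4096 registers
	maxRank       = 64 - precision + 1
)

type hll struct {
	registers [registerCount]uint8
}

// mix64 is the MurmurHash3 64-bit finalizer; it spreads the FNV output over all bits.
func mix64(x uint64) uint64 {
	x ^= x >> 33
	x *= 0xff51afd7ed558ccd
	x ^= x >> 33
	x *= 0xc4ceb9fe1a85ec53
	x ^= x >> 33
	return x
}

// add records one value: the top bits pick a register, and the register keeps
// the largest "position of the first 1 bit" seen in the remaining bits.
func (h *hll) add(value string) {
	f := fnv.New64a()
	f.Write([]byte(value))
	x := mix64(f.Sum64())
	idx := x >> (64 - precision)
	rank := uint8(bits.LeadingZeros64(x<<precision)) + 1
	if rank > maxRank {
		rank = maxRank
	}
	if rank > h.registers[idx] {
		h.registers[idx] = rank
	}
}

// estimate applies the standard HyperLogLog formula, with linear counting
// for small cardinalities.
func (h *hll) estimate() float64 {
	m := float64(registerCount)
	sum, zeros := 0.0, 0
	for _, r := range h.registers {
		sum += math.Pow(2, -float64(r))
		if r == 0 {
			zeros++
		}
	}
	alpha := 0.7213 / (1 + 1.079/m)
	e := alpha * m * m / sum
	if e <= 2.5*m && zeros > 0 {
		return m * math.Log(m/float64(zeros))
	}
	return e
}

// demoEstimate adds n distinct synthetic IPs (each twice) and returns the estimate.
func demoEstimate(n int) float64 {
	h := &hll{}
	for i := 0; i < n; i++ {
		ip := fmt.Sprintf("10.%d.%d.%d", (i>>16)&255, (i>>8)&255, i&255)
		h.add(ip)
		h.add(ip) // duplicates leave the registers unchanged
	}
	return h.estimate()
}

func main() {
	fmt.Printf("true: 50000, estimated: %.0f\n", demoEstimate(50000))
}
```

The memory cost is fixed at one byte per register regardless of how many unique values are added, which is why this approach scales where a facet with a bounded size cannot.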

Dashboard Analytics Structure (internal/nginx_log/analytics/dashboard.go14-100)

GetDashboardAnalytics() returns:

  • Summary: Total PV, UV, traffic, unique pages
  • HourlyStats[]: 48 hours of access statistics
  • DailyStats[]: Daily aggregated metrics
  • TopURLs[]: Most accessed paths with hit counts
  • Browsers[]: Browser usage distribution
  • OperatingSystems[]: OS usage distribution
  • Devices[]: Device type breakdown

Real-Time Monitoring and Events

The log management system publishes WebSocket events via the event package to provide real-time progress updates during indexing operations.

WebSocket Event Flow

Event Types and Structures

The system uses typed event constants from the event package:

TypeNginxLogIndexProgress (api/nginx_log/index_management.go168-179)

Published during indexing to show real-time progress:

TypeNginxLogIndexReady (api/nginx_log/index_management.go234-242)

Published when indexing completes and searcher is updated:

processing_status (api/nginx_log/index_management.go125)

Global indexing status via ProcessingStatusManager:

Frontend Event Subscription

The frontend subscribes using useWebSocketEventBus() composable (app/src/views/nginx_log/NginxLogList.vue90-130):

Subscribe to Processing Status:

Subscribe to Index Ready:

Progress Tracking with useIndexProgress:

The composable maintains a reactive map of file progress (app/src/views/nginx_log/NginxLogList.vue35-36):

Data Storage and Organization

The log management system uses multiple storage layers with grouped sharding for efficient log group queries.

Storage Architecture

Bleve Index Organization

Index Path Configuration (internal/nginx_log/modern_services.go132-162)

The index path is determined by:

  1. Custom path from settings.NginxLogSettings.IndexPath if configured
  2. Otherwise: <config_dir>/log-index/ where <config_dir> is from cSettings.ConfPath
  3. Fallback: ./log-index in current directory

Grouped Sharding Strategy (internal/nginx_log/indexer/parallel_indexer.go74-80)

The GroupedShardManager creates one Bleve index per log group:

  • Key: main_log_path (e.g., /var/log/nginx/access.log)
  • Value: Single bleve.Index containing all documents from that log group
  • Benefits:
    • Efficient queries across all rotated files (access.log, access.log.1, access.log.2.gz, etc.)
    • Reduced shard count compared to per-file sharding
    • Simplified index management

Document Mapping

The Bleve index mapping is defined in CreateLogIndexMapping() (internal/nginx_log/indexer/types.go340-456). Key fields:

| Field | Type | Analyzer | DocValues | Purpose |
| --- | --- | --- | --- | --- |
| timestamp | Numeric | - | - | Time-range queries |
| ip | Text | keyword | Yes | Exact IP matching, UV counting |
| main_log_path | Text | keyword | Yes | Log group filtering |
| file_path | Text | keyword | - | Actual physical file path |
| method | Text | keyword | - | HTTP method filtering |
| path_exact | Text | keyword | Yes | Exact path matching, faceting |
| status | Numeric | - | - | Status code filtering |
| browser | Text | keyword | - | Browser analytics |
| os | Text | keyword | - | OS analytics |
| device_type | Text | keyword | - | Device type analytics |
| region_code | Text | keyword | - | Geographic filtering |

Database Schema: model.NginxLogIndex

The nginx_log_index table tracks indexing metadata (internal/nginx_log/indexer/persistence.go69-124):

| Column | Type | Description |
| --- | --- | --- |
| path | varchar(512) | Primary key: Log file path |
| main_log_path | varchar(512) | Base log path for grouping rotated files |
| index_status | varchar(20) | "indexed", "indexing", "queued", "error", "not_indexed" |
| document_count | bigint | Total indexed documents |
| last_indexed | datetime | Timestamp of last successful index |
| index_start_time | datetime | When last indexing operation started |
| index_duration | bigint | Milliseconds to complete last indexing |
| timerange_start | datetime | Earliest timestamp in indexed logs |
| timerange_end | datetime | Latest timestamp in indexed logs |
| last_modified | datetime | File modification time at last index |
| last_size | bigint | File size in bytes at last index |
| error_message | text | Error description if status is "error" |
| error_time | datetime | When error occurred |
| retry_count | int | Number of retry attempts |
| queue_position | int | Position in indexing queue |
| enabled | boolean | Whether indexing is enabled for this file |

Key Operations:

System Requirements and Configuration

The advanced indexing system has minimum and recommended hardware requirements:

| Resource | Minimum | Recommended | Purpose |
| --- | --- | --- | --- |
| CPU Cores | 1 | 2+ | Worker parallelism |
| RAM | 1GB | 4GB+ | Memory pooling and batch processing |
| Disk Space | 20GB | - | Index storage (varies with log volume) |

Configuration options (internal/nginx_log/indexer/types.go116-200):

The system automatically adjusts worker count and batch sizes based on available system resources and indexing performance.
