File log receiver
The file log receiver ingests logs from local files and forwards them for processing and exporting. It is a versatile, widely used component for collecting file-based logs, such as container logs, system logs, or custom application logs.
The receiver supports multiline parsing, filtering, persistent tracking of file offsets, and routing based on file metadata or log content.
The filelogreceiver is included by default in the EDOT Collector's Kubernetes Helm chart. It is preconfigured to collect logs from pod log files such as /var/log/pods/*/*/*.log, and uses the file_storage extension to track read positions and avoid duplicate ingestion after restarts.
To view or customize the default configuration, refer to:
The following example shows how to ingest Kubernetes container logs with routing logic based on log format:
    receivers:
      filelog:
        include:
          - /var/log/pods/*/*/*.log
        exclude:
          - /var/log/pods/*/*/*.gz
        start_at: beginning
        include_file_name: true
        include_file_path: true
        fingerprint_size: 100
        max_log_size: 102_400
        storage: file_storage
        operators:
          - type: router
            routes:
              - output: parser_containerd
                expr: 'body matches "^\\d{4}-\\d{2}-\\d{2}T"'
              - output: parser_crio
                expr: 'body matches "^[A-Z][a-z]{2} [0-9]{1,2} "'
          - id: parser_containerd
            type: json_parser
            timestamp:
              parse_from: attributes.time
              layout_type: gotime
              layout: 2006-01-02T15:04:05.000000000Z07:00
          - id: parser_crio
            type: regex_parser
            regex: '^(?P<time>[^ ]+ [^ ]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) (?P<body>.*)'
            timestamp:
              parse_from: attributes.time
              layout: '%b %d %H:%M:%S'
The following are some of the most commonly used settings when working with the file log receiver. These options help control what files are read, how logs are parsed, and how file positions are tracked between restarts:
Option | Description |
---|---|
include | List of glob patterns for files to include. |
exclude | Optional glob patterns for files to exclude (for example, rotated or compressed files). |
start_at | beginning or end. Controls where to start reading files when no checkpoint exists. |
operators | Parsing and routing logic for logs. |
storage | Enables persistent tracking of file positions using a storage extension. |
max_log_size | Maximum size of individual log entries (in bytes). |
fingerprint_size | Size (in bytes) used to identify and deduplicate files. |
For the full list of options, refer to the upstream filelogreceiver documentation.
These tips can help you get the most out of the file log receiver:
Use persistent storage to avoid duplicates
Without persistent storage, the receiver does not retain file read positions across restarts. This can result in either duplicate ingestion (if start_at is set to beginning) or lost logs (if set to end). Use the storage: setting and configure a persistent volume when running in Kubernetes.
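As a minimal sketch of how the storage: setting ties the receiver to the file_storage extension, the fragment below declares the extension, references it from the receiver, and enables it in the service section. The directory path, the include glob, and the debug exporter are illustrative assumptions, not values taken from the EDOT defaults. In Kubernetes the directory would need to be backed by a persistent or host-path volume.

```yaml
extensions:
  file_storage:
    # Hypothetical path; mount a persistent volume here so checkpoints survive restarts.
    directory: /var/lib/otelcol/file_storage

receivers:
  filelog:
    include:
      - /var/log/pods/*/*/*.log
    start_at: beginning
    # Refers to the extension declared above by its component ID.
    storage: file_storage

exporters:
  # Placeholder exporter used only to make the pipeline complete.
  debug: {}

service:
  # The extension must be listed here, otherwise the receiver cannot use it.
  extensions: [file_storage]
  pipelines:
    logs:
      receivers: [filelog]
      exporters: [debug]
```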
The default configuration excludes rotated files, which helps prevent duplicate ingestion. If you need to include rotated logs, update the include: and exclude: patterns accordingly.
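One possible way to adjust the patterns is sketched below. It assumes rotated files keep the .log prefix with an added suffix and that compressed copies end in .gz. Both naming conventions are assumptions and should be checked against the files actually present on your nodes.

```yaml
receivers:
  filelog:
    include:
      # Trailing wildcard also matches rotated files such as 0.log.<suffix> (assumed naming).
      - /var/log/pods/*/*/*.log*
    exclude:
      # Compressed rotations are skipped in this sketch.
      - /var/log/pods/*/*/*.gz
    storage: file_storage
```

Widening the include pattern increases the chance of reading the same content twice, so persistent storage becomes more important when you use a pattern like this.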
Multiline log support is not enabled by default. To handle multi-line messages such as stack traces, configure the receiver's multiline setting or add a recombine operator in your Helm chart configuration.
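The sketch below shows two ways multi-line messages can be joined with the upstream receiver, where multiline is a receiver-level setting and recombine is the operator that merges entries. It assumes every new log record starts with a date such as 2024-01-31. The receiver names, the file path, and the pattern are hypothetical.

```yaml
receivers:
  # Variant 1: split the file into entries at each line that starts with a date.
  filelog/multiline:
    include:
      - /var/log/myapp/*.log   # hypothetical application log path
    multiline:
      line_start_pattern: '^\d{4}-\d{2}-\d{2}'

  # Variant 2: read line by line, then merge continuation lines with an operator.
  filelog/recombine:
    include:
      - /var/log/myapp/*.log   # hypothetical application log path
    operators:
      - type: recombine
        combine_field: body
        # A line that starts with a date opens a new entry; other lines are appended.
        is_first_entry: 'body matches "^\\d{4}-\\d{2}-\\d{2}"'
```

For container logs, where the runtime prefixes every line, the operator-based variant is usually placed after the parser that strips that prefix.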
If your environment produces logs in multiple formats (for example, containerd and CRI-O), use the router operator to apply appropriate parsers based on the log structure.
Using start_at: beginning without a storage extension will re-read all files from the start after each restart, which might lead to duplicate log entries.
Like any component, the file log receiver has some trade-offs and behaviors to be aware of, especially in Kubernetes environments:
Persistent log tracking requires explicit storage: configuration and persistent volume support in Kubernetes.
Multiline logs are not parsed by default. You must customize the configuration to parse them.
Incorrect include/exclude globs can result in missing rotated logs or unintended ingestion.
High-volume directories might require tuning of max_concurrent_files.
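As an illustration of the last point, the fragment below raises max_concurrent_files and lengthens the polling interval for a directory with many active files. The numbers are placeholders chosen for the example, not recommendations. Suitable values depend on file count, log volume, and the file descriptor limits of the node.

```yaml
receivers:
  filelog:
    include:
      - /var/log/pods/*/*/*.log
    # Upper bound on files read at the same time (illustrative value).
    max_concurrent_files: 2048
    # How often the receiver checks for new or changed files (illustrative value).
    poll_interval: 500ms
    storage: file_storage
```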