Open
Description
Bug Report
Describe the bug
When using the forward input plugin, we can consistently trigger a segfault with a low Mem_Buf_Limit value. In our initial testing it appeared to happen when Mem_Buf_Limit is less than the forward input plugin's Buffer_Max_Size value (default 6MB). EDIT: after confirming with valgrind, the bug doesn't appear to be related to the exact buffer sizes, only to the frequency and duration of the input pauses.
To Reproduce
seg-fault 1:
    [2025/11/05 17:48:47.746440550] [ info] [input] forward.3 resume (mem buf overlimit - buf size 2053099B now below limit 4000000B)
    [2025/11/05 17:48:47.748714140] [ warn] [input] forward.3 paused (mem buf overlimit - event of size 5579B exceeded limit 4000000 to 4000195B)
    [2025/11/05 17:48:47.748742248] [ info] [input] pausing forward.3
    [2025/11/05 17:48:47] [engine] caught signal (SIGSEGV)
    #0 0xaaaad1ba66c7 in fw_conn_event() at plugins/in_forward/fw_conn.c:75
    #1 0xaaaad1aeff03 in flb_engine_start() at src/flb_engine.c:1136
    #2 0xaaaad1acab63 in flb_lib_worker() at src/flb_lib.c:835
    #3 0xffff9af7202f in start_thread() at reate.c:442
    #4 0xffff9afdbf1b in thread_start() at sysv/linux/aarch64/clone.S:79
    #5 0xffffffffffffffff in ???() at ???:0
seg-fault 2:
    [2025/11/05 20:07:54.252168021] [ info] [input] forward.0 resume (mem buf overlimit - buf size 1955268B now below limit 4000000B)
    [2025/11/05 20:07:54.649327921] [ warn] [input] forward.0 paused (mem buf overlimit - event of size 7323B exceeded limit 4000000 to 4005721B)
    [2025/11/05 20:07:54.649336291] [ info] [input] pausing forward.0
    [2025/11/05 20:07:54] [engine] caught signal (SIGSEGV)
    #0 0x7f8810d91e55 in ???() at ???:0
    #1 0x98aca9 in receiver_recv() at plugins/in_forward/fw_prot.c:1086
    #2 0x98ad5b in receiver_to_unpacker() at plugins/in_forward/fw_prot.c:1102
    #3 0x98b795 in fw_prot_process() at plugins/in_forward/fw_prot.c:1291
    #4 0x9831d4 in fw_conn_event() at plugins/in_forward/fw_conn.c:126
    #5 0x54981a in flb_engine_start() at src/flb_engine.c:1136
    #6 0x4cbe48 in flb_lib_worker() at src/flb_lib.c:904
    #7 0x7f8810c8b2e9 in ???() at ???:0
    #8 0x7f8810d104ff in ???() at ???:0
    #9 0xffffffffffffffff in ???() at ???:0
- Steps to reproduce the problem:
- Fluent-bit config file:
    [SERVICE]
        Flush 1
    [INPUT]
        Name forward
        Mem_Buf_Limit 4MB
        unix_path /var/run/fluent.sock
    [OUTPUT]
        Name flowcounter
        Match *
        Unit second
    [OUTPUT]
        Name file
        Match *
        Path /home/ec2-user/fluent
- Build and run latest fluent-bit binary with debug/backtrace flags enabled:
    cmake -DFLB_RELEASE=Off -DFLB_MTRACE=On -DFLB_DEV=On -DFLB_BACKTRACE=On -DFLB_DEBUG=On -DFLB_TRACE=On -DFLB_VALGRIND=On ..
    make -j $(nproc)
    bin/fluent-bit -c fluent.conf
- Run high load log generator script on the socket. I am using a docker container to do this like this:
    docker run -d \
      --log-driver=fluentd \
      --log-opt mode=non-blocking \
      --log-opt fluentd-address=unix:///var/run/fluent.sock \
      -e LOG_RATE_PER_SECOND=1000 \
      -e LOG_SIZE_KB=7 \
      public.ecr.aws/cssparr/loggen:latest
- Observe the segfault log message that occurs above.
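Since the crash seems tied to how often the input is paused and resumed, it may help to count those events in a captured log before the SIGSEGV. The following is only a rough sketch: the function names `count_paused` and `count_resumed` are made up for illustration, and it assumes the Fluent Bit output was saved to a file with one log entry per line, matching the message text shown in the logs above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Fluent Bit): count pause/resume events
# for inputs in a captured Fluent Bit log file, one log entry per line.

# Counts lines like:
#   [ warn] [input] forward.3 paused (mem buf overlimit - event of size ...)
count_paused() {
    grep -c 'paused (mem buf overlimit' "$1"
}

# Counts lines like:
#   [ info] [input] forward.3 resume (mem buf overlimit - buf size ...)
count_resumed() {
    grep -c 'resume (mem buf overlimit' "$1"
}
```

For example, with the log saved to a file named `fluent-bit.log` (an assumed name), `count_paused fluent-bit.log` and `count_resumed fluent-bit.log` print the number of pause and resume events, which can be compared across runs that do and do not crash.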
Expected behavior
No segfault. The forward input should pause and resume under memory pressure without crashing.
Your Environment
See the config and the image/version used in the steps to reproduce above.