My lab has a few Docker containers as follows:

Name        Docker image
----------  ---------------------------------
Fluentd     fluent/fluentd:v1.16-1
Fluent-bit  cr.fluentbit.io/fluent/fluent-bit
Loki        grafana/loki
Grafana     grafana/grafana-enterprise
Caddy       caddy:builder

My goal is to collect Caddy logs and visualize them in Grafana.

Scenario: Fluent-bit tails the logs and sends them to Fluentd. Then Fluentd pushes the logs to Loki. My aim is to use Fluentd as the central log collector.

The problem is parsing those logs on the Grafana side.

The Caddy logs are in (nested) JSON format. Sample:

 {"level":"info","ts":1712949034.535184,"logger":"http.log.access.log1","msg":"handled request","request":{"remote_ip":"172.18.0.1","remote_port":"39664","client_ip":"172.18.0.1","proto":"HTTP/1.1","method":"POST","host":"grafana.darknet.com","uri":"/api/short-urls","headers":{"Content-Length":["580"],"Origin":["http://grafana.darknet.com"],"Content-Type":["application/json"],"User-Agent":["Mozilla/5.0 (X11; Linux x86_64; rv:124.0) Gecko/20100101 Firefox/124.0"],"Accept":["application/json, text/plain, */*"],"X-Grafana-Org-Id":["1"],"Connection":["keep-alive"],"Accept-Language":["en-US,en;q=0.5"],"Accept-Encoding":["gzip, deflate"],"Referer":["http://grafana.darknet.com/explore?schemaVersion=1&panes=%7B%22Efb%22:%7B%22datasource%22:%22f779c221-7bd2-468d-9f9c-96e069b869f8%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bjob%3D%5C%22caddy.log.loki%5C%22%7D%20%7C%20json%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22f779c221-7bd2-468d-9f9c-96e069b869f8%22%7D,%22editorMode%22:%22code%22%7D%5D,%22range%22:%7B%22from%22:%22now-1m%22,%22to%22:%22now%22%7D%7D%7D&orgId=1"],"X-Grafana-Device-Id":["f343e938e74b3a57997faff69d24de8a"],"Cookie":[]}},"bytes_read":580,"user_id":"","duration":0.011267887,"size":72,"status":200,"resp_headers":{"X-Xss-Protection":["1; mode=block"],"Date":["Fri, 12 Apr 2024 19:10:34 GMT"],"Content-Length":["72"],"Server":["Caddy"],"Cache-Control":["no-store"],"Content-Type":["application/json"],"X-Content-Type-Options":["nosniff"],"X-Frame-Options":["deny"]}} 

I have tried two different configurations so far:

  1. Have Fluent-bit send the logs to Fluentd, then Fluentd forwards the logs to Loki (tagged as caddy.log)
    Schema: Caddy --> Fluent-bit --> Fluentd --> Loki

  2. Have Fluent-bit send the logs straight to Loki (tagged as caddy.log.loki)
    Schema: Caddy --> Fluent-bit --> Loki

Here I have the following Fluent-bit config to send logs to both Loki and Fluentd at the same time, with different tags:

[INPUT]
    Name     tail
    Path     /var/log/caddy/*.log
    Parser   json
    Tag      caddy.log
    Path_Key log_filename

# send logs to Fluentd
[OUTPUT]
    Name   forward
    Host   fluentd
    Port   24224
    Match  caddy.*

# send logs straight to Loki
[OUTPUT]
    name   loki
    match  caddy.*
    host   loki
    port   3100
    labels job=caddy.log.loki

Fluentd config:

<source>
  @type forward
</source>

<match caddy.*>
  @type loki
  url "http://loki:3100"
  extra_labels {"job": "caddy.log"}
  <buffer>
    flush_interval 5s
    flush_at_shutdown true
  </buffer>
</match>

Then in Grafana I can browse the logs and I have the two labels available in the Explore window.

If I choose the label caddy.log.loki, the logs are displayed in plain JSON as shown below. With the expression {job="caddy.log.loki"} | json I can parse them. Some of the nested JSON is extracted, e.g. request_client_ip, but not all of it; for example, request.headers is missing, but I can live with that.

[screenshot: logs seen as JSON]

If I choose the label caddy.log, the logs are displayed in a "mixed" format:

[screenshot: logs in mixed format]

It appears that some transformation took place, but I am not sure where. I can use logfmt to parse the lines (see the query after the screenshot), but I am still left with some unparsed fields (request, resp_headers), as shown below:

[screenshot: after using logfmt]
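
For reference, a minimal form of the logfmt query, assuming the job label set by the Fluentd config:

{job="caddy.log"} | logfmt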

Questions:

  • Why is it that the logs are no longer rendered in plain JSON once I add the Fluentd step?
  • What would be the best way to ship and parse nested JSON logs in Loki/Grafana with Fluentd?

1 Answer

Why is it that the logs are no longer rendered in plain JSON once I add the Fluentd step?

According to the fluentd Loki output plugin docs, the default line_format is key_value. You did not specify the format in your Fluentd configuration, so the log lines are pushed not as JSON but in <key>=<value> format.
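
Adding line_format json to the Loki output should restore the original JSON lines. A minimal sketch based on your Fluentd config (only the line_format line is new):

<source>
  @type forward
</source>

<match caddy.*>
  @type loki
  url "http://loki:3100"
  extra_labels {"job": "caddy.log"}
  line_format json
  <buffer>
    flush_interval 5s
    flush_at_shutdown true
  </buffer>
</match>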

What would be the best way to ship and parse nested JSON logs in Loki/Grafana with Fluentd?

You can try adding a Nest filter to your Fluent-bit configuration:

[SERVICE]
    parsers_file parsers.conf

[INPUT]
    Name     tail
    Path     /var/log/caddy/*.log
    Parser   json
    Tag      caddy.log
    Path_Key log_filename

[FILTER]
    Name         nest
    Match        caddy.*
    Operation    lift
    Nested_under request

# Nest filters can be chained
[FILTER]
    Name         nest
    Match        caddy.*
    Operation    lift
    Nested_under headers
    Add_prefix   req_

[FILTER]
    Name         nest
    Match        caddy.*
    Operation    lift
    Nested_under resp_headers
    Add_prefix   resp_

# send logs to Fluentd
[OUTPUT]
    Name   forward
    Host   fluentd
    Port   24224
    Match  caddy.*

# send logs straight to Loki
[OUTPUT]
    name   loki
    match  caddy.*
    host   loki
    port   3100
    labels job=caddy.log.loki

Result:

[screenshot: Grafana Explore output]

Note that the LogQL json parser without parameters will skip arrays (https://grafana.com/docs/loki/latest/query/log_queries/#json), so if you want fields that contain arrays you have to specify them explicitly as parameters, for example:

{job="caddy.log.loki"} |= `` | json request_user_agent=`["req_User-Agent"]` 

[screenshot: LogQL JSON parser with parameters]

  • line_format json indeed did the trick. Now the logs arrive as JSON after being forwarded by Fluentd. The nested JSON is also partially parsed; for example, request_client_ip is available straight out of the box. Extracting array values like the headers would probably take a few more filter and parser steps, but I am already happy with what I have. I also added TLS transport between Fluent-bit and Fluentd (sketched below). Thanks a lot. Commented Apr 14, 2024 at 14:07
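
For reference, a rough sketch of the TLS hop between the two (the certificate paths are placeholders, not from the original setup):

# Fluent-bit side: enable TLS on the forward output
[OUTPUT]
    Name        forward
    Match       caddy.*
    Host        fluentd
    Port        24224
    tls         on
    tls.verify  on
    tls.ca_file /fluent-bit/certs/ca.crt

# Fluentd side: accept TLS in the forward source
<source>
  @type forward
  <transport tls>
    cert_path        /fluentd/certs/server.crt
    private_key_path /fluentd/certs/server.key
  </transport>
</source>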
