Multi-line logs (e.g. exception stack traces) provide a lot of very valuable information for debugging application problems. Today's distributed microservice architectures generally collect logs centrally, with solutions such as ELK and EFK being common, but without proper configuration these solutions do not treat a multi-line log as a whole: each line is handled as a separate log entry, which is hard for us to accept.
In this article, we will describe the strategies of some common log collection tools for handling multi-line logs.
JSON
The easiest way to ensure that a multi-line log is processed as a single event is to write the log in JSON format. Take the following ordinary Java application log as an example.
# javaApp.log
2019-08-14 14:51:22,299 ERROR [http-nio-8080-exec-8] classOne: Index out of range
java.lang.StringIndexOutOfBoundsException: String index out of range: 18
	at java.lang.String.charAt(String.java:658)
	at com.example.app.loggingApp.classOne.getResult(classOne.java:15)
	at com.example.app.loggingApp.AppController.tester(AppController.java:27)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:190)
	at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:138)
[...]
If the above log is collected directly, it will be split into multiple separate entries rather than recognized as one event; it is much easier to write these logs in JSON format and then ingest the JSON data, for example by using Log4j2 to record them in the following format.
{"@timestamp":"2019-08-14T18:46:04.449+00:00","@version":"1","message":"Index out of range","logger_name":"com.example.app.loggingApp.classOne","thread_name":"http-nio-5000-exec-6","level":"ERROR","level_value":40000,"stack_trace":"java.lang.StringIndexOutOfBoundsException: String index out of range: 18\n\tat java.lang.String.charAt(String.java:658)\n\tat com.example.app.loggingApp.classOne.getResult(classOne.java:15)\n\tat com.example.app.loggingApp.AppController.tester(AppController.java:27)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\n\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.lang.reflect.Method.invoke(Method.java:498)\n\tat org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:190)\n\tat org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:138)\n\tat
[...]
}
The entire log message is contained in a single JSON object, including the complete exception stack trace. Most tools support parsing JSON log data directly, so this is the simplest and, from an operations point of view, the most worry-free approach, although many developers are reluctant to write their logs in JSON format.
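Field names such as @timestamp, logger_name and stack_trace in the sample above are the ones emitted by the logstash-logback-encoder library for Logback; as a hedged sketch of one way to produce single-line JSON logs (not necessarily the setup behind the sample), a minimal logback.xml might look like this:

<configuration>
  <appender name="JSON_FILE" class="ch.qos.logback.core.FileAppender">
    <file>/var/app/current/logs/javaApp.log</file>
    <!-- Writes each log event, including any stack trace, as one JSON object per line -->
    <encoder class="net.logstash.logback.encoder.LogstashEncoder"/>
  </appender>
  <root level="INFO">
    <appender-ref ref="JSON_FILE"/>
  </root>
</configuration>

With every event on a single line, downstream collectors can parse the JSON directly and no multi-line handling is needed.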
Logstash
For those using Logstash, supporting multi-line logs is not difficult: Logstash can parse them with the multiline codec, configured in the input section of the logging pipeline. For example, the following configuration tells Logstash to match ISO8601-format timestamps in the log file; whenever a line does not begin with such a timestamp, it is collapsed into the previous log entry.
input {
  file {
    path => "/var/app/current/logs/javaApp.log"
    mode => "tail"
    codec => multiline {
      pattern => "^%{TIMESTAMP_ISO8601} "
      negate => true
      what => "previous"
    }
  }
}
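After the lines are merged, the message field still holds the raw multi-line text. As a hedged sketch (not part of the original configuration; the field names are illustrative), a grok filter could be added to the same pipeline to pull the timestamp, level and thread out of the merged event:

filter {
  grok {
    # Parse the first line of the merged event; the full multi-line text,
    # including the stack trace, remains available in the "message" field
    match => { "message" => "^%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} \[%{DATA:thread}\] %{GREEDYDATA:log_message}" }
  }
}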
Fluentd
Similar to Logstash, Fluentd lets us use a plugin to handle multi-line logs, and we can configure the plugin to accept one or more regular expressions. Take the following Python multi-line log as an example.
2019-08-01 18:58:05,898 ERROR:Exception on main handler
Traceback (most recent call last):
File "python-logger.py", line 9, in make_log
return word[13]
IndexError: string index out of range
Without the multiline parser, Fluentd treats each line as a complete log entry. We can add a multiline parsing rule to the <source> block; it must include a format_firstline parameter that specifies what a new log entry starts with, and you can also use regular-expression grouping and capturing to parse attributes out of the log, as configured below.
<source>
  @type tail
  path /path/to/pythonApp.log
  tag sample.tag
  <parse>
    @type multiline
    format_firstline /\d{4}-\d{1,2}-\d{1,2}/
    format1 /(?<timestamp>[^ ]* [^ ]*) (?<level>[^\s]+:)(?<message>[\s\S]*)/
  </parse>
</source>
In the parse section we select the multiline parser with @type multiline, then use format_firstline to specify the rule for the beginning of a multi-line log entry; here we simply use a regular expression that matches a date. We then specify the matching pattern for the remaining parts and name the capture groups, splitting the log into timestamp, level and message fields.
With the parsing rules above in place, Fluentd now treats each traceback as a single log entry.
{
  "timestamp": "2019-08-01 19:22:14,196",
  "level": "ERROR:",
  "message": "Exception on main handler\nTraceback (most recent call last):\n File \"python-logger.py\", line 9, in make_log\n return word[13]\nIndexError: string index out of range"
}
The log has been formatted as JSON, and the capture group names we matched have become the keys.
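To quickly verify the structured result, a minimal sketch (reusing the sample.tag tag from the source above; the output plugin choice is just for illustration) could print the parsed events with Fluentd's built-in stdout output:

<match sample.tag>
  @type stdout
</match>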
The official Fluentd documentation also contains several more examples.
Rails Logs
For example, suppose the incoming Rails logs look like this.
Started GET "/users/123/" for 127.0.0.1 at 2013-06-14 12:00:11 +0900
Processing by UsersController#show as HTML
Parameters: {"user_id"=>"123"}
Rendered users/show.html.erb within layouts/application (0.3ms)
Completed 200 OK in 4ms (Views: 3.2ms | ActiveRecord: 0.0ms)
We can use the following parsing configuration for multi-line matching.
<parse>
  @type multiline
  format_firstline /^Started/
  format1 /Started (?<method>[^ ]+) "(?<path>[^"]+)" for (?<host>[^ ]+) at (?<time>[^ ]+ [^ ]+ [^ ]+)\n/
  format2 /Processing by (?<controller>[^\u0023]+)\u0023(?<controller_method>[^ ]+) as (?<format>[^ ]+?)\n/
  format3 /( Parameters: (?<parameters>[^ ]+)\n)?/
  format4 / Rendered (?<template>[^ ]+) within (?<layout>.+) \([\d\.]+ms\)\n/
  format5 /Completed (?<code>[^ ]+) [^ ]+ in (?<runtime>[\d\.]+)ms \(Views: (?<view_runtime>[\d\.]+)ms \| ActiveRecord: (?<ar_runtime>[\d\.]+)ms\)/
</parse>
The logs obtained after parsing are shown below.
{
  "method" :"GET",
  "path" :"/users/123/",
  "host" :"127.0.0.1",
  "controller" :"UsersController",
  "controller_method":"show",
  "format" :"HTML",
  "parameters" :"{ \"user_id\":\"123\"}",
  ...
}
Java Stack Logging
For example, the log we are now parsing is shown below.
2013-3-03 14:27:33 [main] INFO Main - Start
2013-3-03 14:27:33 [main] ERROR Main - Exception
javax.management.RuntimeErrorException: null
at Main.main(Main.java:16) ~[bin/:na]
2013-3-03 14:27:33 [main] INFO Main - End
Then we can use the following parsing rule for multi-line matching.
<parse>
  @type multiline
  format_firstline /\d{4}-\d{1,2}-\d{1,2}/
  format1 /^(?<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}) \[(?<thread>.*)\] (?<level>[^\s]+)(?<message>.*)/
</parse>
The parsed logs look like this.
{
  "thread" :"main",
  "level" :"INFO",
  "message":" Main - Start"
}
{
  "thread" :"main",
  "level" :"ERROR",
  "message":" Main - Exception\njavax.management.RuntimeErrorException: null\n at Main.main(Main.java:16) ~[bin/:na]"
}
{
  "thread" :"main",
  "level" :"INFO",
  "message":" Main - End"
}
In addition to format_firstline, which specifies the start line of a multi-line log entry, the multi-line parsing configuration above also uses format1, format2 … formatN, where N ranges from 1 to 20; these are the list of Regexp patterns for the multi-line log. For readability, you can split a long Regexp pattern across multiple formatN parameters; Fluentd concatenates these patterns to construct the regular expression used for multi-line matching.
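As an illustration of this splitting (a hedged sketch based on the Java stack example above, not taken from the original post), the single format1 pattern could equivalently be written as several shorter patterns that Fluentd concatenates:

<parse>
  @type multiline
  format_firstline /\d{4}-\d{1,2}-\d{1,2}/
  # format1 to format3 below are concatenated into one regular expression,
  # equivalent to the single format1 used earlier
  format1 /^(?<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}) /
  format2 /\[(?<thread>.*)\] /
  format3 /(?<level>[^\s]+)(?<message>.*)/
</parse>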
Fluent Bit
Fluent Bit's tail input plugin also provides configuration options for handling multi-line logs. For example, let's process the same Python multi-line log shown earlier.
2019-08-01 18:58:05,898 ERROR:Exception on main handler
Traceback (most recent call last):
File "python-logger.py", line 9, in make_log
return word[13]
IndexError: string index out of range
If we don't use multiline parsing, Fluent Bit will likewise treat each line as a separate log entry. We can structure the multi-line log using Fluent Bit's built-in regex parser, as configured below.
[PARSER]
    Name    log_date
    Format  regex
    Regex   /\d{4}-\d{1,2}-\d{1,2}/

[PARSER]
    Name    log_attributes
    Format  regex
    Regex   /(?<timestamp>[^ ]* [^ ]*) (?<level>[^\s]+:)(?<message>[\s\S]*)/

[INPUT]
    Name              tail
    tag               sample.tag
    path              /path/to/pythonApp.log
    Multiline         On
    Parser_Firstline  log_date
    Parser_1          log_attributes
Similar to Fluentd, the Parser_Firstline parameter specifies the name of the parser that matches the beginning of a multi-line log entry, and we can include additional parsers to further structure the logs. Here we configured Parser_Firstline to match log lines that start with an ISO8601 date, and then used the Parser_1 parameter to specify the pattern that matches the rest of the log message, assigning the captures to the timestamp, level and message keys.
After this conversion, our logs end up in the format shown below.
{
  "timestamp": "2019-08-01 19:22:14,196",
  "level": "ERROR:",
  "message": "Exception on main handler\nTraceback (most recent call last):\n File \"python-logger.py\", line 9, in make_log\n return word[13]\nIndexError: string index out of range"
}