ELK Series, Part 2: Using the multiline Module [Repost]

Preface

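The previous post covered installing ELK, Kibana's basic search syntax, and the input and output syntax of Logstash. In practice, however, we ran into a problem. By default ELK treats each line as one event, but logs such as Java output often spread a single event over several lines. ELK then fails to recognize those lines as one event, so in Kibana a log entry that is clearly a single event shows up split into several pieces. For logs where one event spans multiple lines, we need another ELK module, multiline, which merges several lines into one event so that the event appears as a single entry in Kibana. Let's look at this more advanced usage.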

All of the operations below are performed on linux-node1.

Merging multi-line logs into a single event

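We need the multiline module in the Logstash input: https://www.elastic.co/guide/en/logstash/2.3/plugins-filters-multiline.html (this link is the documentation page for the multiline filter; the configuration below uses multiline as a codec inside the input, with the same pattern, negate, and what options).
After reading the official configuration documentation, let's configure it ourselves. Here is the configuration file: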

[root@linux-node1 ~]# cat /etc/logstash/conf.d/codec.conf

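input{
    stdin {
        codec => multiline{
            pattern => "^\["     # regular expression: matches lines that begin with [
            negate => true       # boolean: true inverts the match, so the rule applies to lines that do NOT begin with [
            what => "previous"   # merge those lines into the previous line's event ("next" would merge them into the following line instead)
        }
    }
}
filter{
}
output{
    stdout{
        codec => rubydebug
    }
}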

Once the configuration is in place, start Logstash:

[root@linux-node1 conf.d]# /opt/logstash/bin/logstash -f codec.conf

Now test whether lines beginning with [ are matched:

[root@linux-node1 conf.d]# /opt/logstash/bin/logstash -f codec.conf

Settings: Default pipeline workers: 2
Pipeline main started
asdfasdf
xcfv
sdf
swe
rwer
asdf
[fasdfasdf               # a line beginning with [ starts a new event, so the lines typed before it are emitted as one event
{
    "@timestamp" => "2016-12-08T23:25:14.350Z",
       "message" => "asdfasdf\nxcfv\nsdf\nswe\nrwer\nasdf",
      "@version" => "1",
          "tags" => [
        [0] "multiline"
    ],
          "host" => "linux-node1"
}
1111111111
2222222222222
33333333333
44444444444
[                     # again, this [ closes the previous event, which begins at [fasdfasdf and includes the lines that followed it
{
    "@timestamp" => "2016-12-08T23:25:25.374Z",
       "message" => "[fasdfasdf\n1111111111\n2222222222222\n33333333333\n44444444444",
      "@version" => "1",
          "tags" => [
        [0] "multiline"
    ],
          "host" => "linux-node1"
}
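The grouping rule seen in this test can be sketched in a few lines of Python. This is only an illustration of the behaviour observed above, not Logstash's actual implementation; the function name merge_multiline and its parameters are made up for this example, and only the what => "previous" case is modelled.

```python
import re


def merge_multiline(lines, pattern=r"^\[", negate=True):
    """Group lines into events, mimicking what => "previous" (illustrative only)."""
    regex = re.compile(pattern)
    events = []
    buffer = []
    for line in lines:
        matched = bool(regex.search(line))
        # With negate=True, a line that does NOT match the pattern
        # is a continuation of the previous line's event.
        continuation = (not matched) if negate else matched
        if continuation or not buffer:
            buffer.append(line)
        else:
            # A non-continuation line closes the buffered event and starts a new one.
            events.append("\n".join(buffer))
            buffer = [line]
    if buffer:
        # Flush whatever is still buffered at the end of the input.
        events.append("\n".join(buffer))
    return events


# The same lines that were typed into stdin in the test above.
demo_lines = [
    "asdfasdf", "xcfv", "sdf", "swe", "rwer", "asdf",
    "[fasdfasdf", "1111111111", "2222222222222", "33333333333", "44444444444",
    "[",
]
events = merge_multiline(demo_lines)
for event in events:
    print(repr(event))
```

Unlike this sketch, which flushes the last buffer when the input ends, the stdin test only prints an event once the next line beginning with [ arrives, which is why the final [ above has not yet produced output.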

Next, let's use this module to handle Java-style logs.

We will use the elasticsearch log for this, since it is in the Java log format. Its path is /var/log/elasticsearch/myes.log.

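As mentioned before, Logstash keeps a file (the sincedb) that records which files it has read and how far it got. Because this run reads the same file as before (the earlier configuration also read it), we need to delete Logstash's sincedb before starting again, so that collection begins from the start of the file. The corresponding es-log index also has to be deleted in the head plugin.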

[root@linux-node1 logstash]# find / -name ".since*"
/root/.sincedb_a9b9fed7edff6fd888ffe131a05b5397
/root/.sincedb_1fb922e15ccea4ac0d028d33639ba3ea
/var/lib/logstash/.sincedb_a9b9fed7edff6fd888ffe131a05b5397
/var/lib/logstash/.sincedb_1fb922e15ccea4ac0d028d33639ba3ea
[root@linux-node1 logstash]# cat /root/.sincedb_a9b9fed7edff6fd888ffe131a05b5397
395042 0 2051 5972         # note the first column: it is the inode of the log file
395128 0 2051 5177
[root@linux-node1 logstash]# ls -li  /var/log/elasticsearch/myes.log      # -i adds the file's inode number to the long listing
395042 -rw-r--r--. 1 root root 454310 Dec  9 07:02 /var/log/elasticsearch/myes.log          # the first column is this file's inode; it matches the entry in /root/.sincedb_a9b9fed7edff6fd888ffe131a05b5397

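Note that the sincedb records the inode of each log file.
Delete the sincedb file in which Logstash recorded myes.log, and delete the es-log index in the head web interface, so that the log is collected again from the beginning.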

[root@linux-node1 ~]# rm -f /root/.sincedb_a9b9fed7edff6fd888ffe131a05b5397  # find the sincedb file that records the inode of myes.log and delete it

Now modify the configuration file to use the multiline module. The configuration looks like this:

[root@linux-node1 conf.d]# cat /etc/logstash/conf.d/codec.conf

input{
    file {
        path => ["/var/log/messages","/var/log/secure"]
        type => "system-log"
        start_position => "beginning"
    }
    file {
        path => "/var/log/elasticsearch/myes.log"   # read the elasticsearch log
        type => "es-log"
        start_position => "beginning"
        codec => multiline {
            pattern => "^\["    # regular expression: lines that begin with [
            negate => true      # apply the rule to lines that do NOT begin with [
            what => "previous"  # append those lines to the previous line's event
        }
    }
}
filter{
}
output{
    if [type] == "system-log" {
        elasticsearch {
            hosts => ["192.168.141.4:9200"]
            index => "system-log-%{+YYYY.MM}"
        }
    }
    if [type] == "es-log" {
        elasticsearch {
            hosts => ["192.168.141.4:9200"]
            index => "es-log-%{+YYYY.MM}"
        }
    }
}

After checking that everything is correct, start logstash:

[root@linux-node1 ~]# /opt/logstash/bin/logstash -f /etc/logstash/conf.d/codec.conf

Then restart elasticsearch on both linux-node1 and linux-node2:

[root@linux-node1 ~]# service  elasticsearch restart    # on node1

[root@linux-node2 ~]# service  elasticsearch restart    # on node2

Reopen the head plugin at http://192.168.141.3:9200/_plugin/head/
A new index named es-log should now be there.

Open the Kibana page at http://192.168.141.3:5601

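This configuration differs from the one in the previous post: that one named indices by year, month, and day, while this one names them by year and month. So we need to add a new index pattern in Kibana.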
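To see why the index names change, here is a small Python sketch of how a date-suffixed index name is derived from an event timestamp. It is only an approximation for illustration: the function name index_name is made up, and the daily format (year.month.day) is an assumption about the previous post's configuration, which is not shown here.

```python
from datetime import datetime


def index_name(prefix, timestamp, monthly=True):
    """Build an index name such as es-log-2016.12 from a prefix and a timestamp."""
    # Monthly mirrors %{+YYYY.MM}; the daily format is an assumed year.month.day.
    date_format = "%Y.%m" if monthly else "%Y.%m.%d"
    return "{}-{}".format(prefix, timestamp.strftime(date_format))


# The @timestamp of the first event in the stdin test earlier in this post.
event_time = datetime.strptime("2016-12-08T23:25:14.350Z", "%Y-%m-%dT%H:%M:%S.%fZ")

print(index_name("es-log", event_time))                 # monthly naming
print(index_name("es-log", event_time, monthly=False))  # daily naming (assumed format)
```

With monthly naming, every event from the same month lands in one index, so an index pattern created for the daily names of the previous post will not match these indices.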

Then switch to the Discover tab and select es-log.

Now the multi-line Java log entries are merged, so each event is displayed as a single entry instead of being split across several.

Reposted from

ELK Series, Part 2: Using the multiline Module - 温柔易淡 - 博客园 (cnblogs)
http://www.cnblogs.com/liaojiafa/p/6155322.html