There are many alternatives to logstash; fluentd and Filebeat are the most common. This post uses fluentd in place of logstash for log collection. For installing Elasticsearch and Kibana, see "ELK log-analysis platform cluster setup"; below we cover installing and configuring fluentd.
1、Run the following command on every node that needs log collection (it installs td-agent, the packaged distribution of fluentd):
curl -L https://toolbelt.treasuredata.com/sh/install-redhat-td-agent2.sh | sh
2、Install plugins
td-agent-gem install fluent-plugin-elasticsearch
td-agent-gem install fluent-plugin-grep #filtering plugin
3、td-agent.conf configuration
Note: for Java log collection, see "fluentd multiline parsing and formatting for Java (log4j2) logs". The format regex below cannot match irregular multiline Java logs (stack traces); use the multiline parser instead.
Alternatively, replace the format line in the Java log sources below with the following three lines:
format multiline
format_firstline /\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}.\d{3}/
format1 /^(?<access_time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}.\d{3}) (?<level>\S+)\s+\[(?<thread>\S+)\] (?<message>.*)/
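As a quick illustration of what these three lines do, the Python sketch below groups physical lines into logical events the way format_firstline does, then captures the fields with the format1 pattern. The sample log lines are made up, and Python spells Ruby's `(?<name>)` groups as `(?P<name>)`:

```python
import re

# The Ruby patterns above, rewritten with Python's (?P<name>) group syntax.
firstline = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}.\d{3}")
entry = re.compile(
    r"^(?P<access_time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}.\d{3}) "
    r"(?P<level>\S+)\s+\[(?P<thread>\S+)\] (?P<message>.*)",
    re.DOTALL,
)

lines = [
    "2018-12-27 20:41:40.649 ERROR [main] something failed",
    "java.lang.NullPointerException",          # continuation line
    "\tat com.example.Foo.bar(Foo.java:42)",   # continuation line
    "2018-12-27 20:41:41.000 INFO [main] recovered",
]

# A new event starts whenever the first-line pattern matches,
# exactly what format_firstline is for.
events, current = [], []
for line in lines:
    if firstline.match(line) and current:
        events.append("\n".join(current))
        current = []
    current.append(line)
if current:
    events.append("\n".join(current))

first = entry.match(events[0])
print(len(events), first.group("level"), first.group("thread"))
```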
#vim /etc/td-agent/td-agent.conf
Add the following content:
#Java log collection
#forward the logs to Elasticsearch for use in Kibana
<match dev.**>
@type elasticsearch
host gdg-dev
port 9200
flush_interval 10s
index_name ${tag}-%Y.%m.%d
type_name ${tag}-%Y.%m.%d
logstash_format true
logstash_prefix ${tag}
include_tag_key true
tag_key @log_name
<buffer tag, time>
timekey 1h
</buffer>
</match>
#collect and parse the logs; because log4j2 output is irregular, format can be set to none to skip parsing
<source>
@type tail
format /(?<access_time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}.\d+)\s+(?<level>\w*)\s+(?<thread_id>\S*)\s+(?<message>.*)$/
#time_format %Y-%m-%dT%H:%M:%S.%NZ
path /usr/local/logs/bms-api.log
pos_file /var/log/td-agent/bms.api.log.pos
tag dev.bms-api.log
</source>
<source>
@type tail
#time_format %Y-%m-%dT%H:%M:%S.%NZ
format /(?<access_time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}.\d+)\s+(?<level>\w*)\s+(?<thread_id>\S*)\s+(?<message>.*)$/
path /usr/local/logs/article-api.log
pos_file /var/log/td-agent/article.api.log.pos
tag dev.article-api.log
</source>
<source>
@type tail
format /(?<access_time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}.\d+)\s+(?<level>\w*)\s+(?<thread_id>\S*)\s+(?<message>.*)$/
#time_format %Y-%m-%dT%H:%M:%S.%NZ
path /usr/local/logs/platform-api.log
pos_file /var/log/td-agent/platform.api.log.pos
read_from_head true
tag dev.platform-api.log
</source>
<source>
@type tail
format /(?<access_time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}.\d+)\s+(?<level>\w*)\s+(?<thread_id>\S*)\s+(?<message>.*)$/
#time_format %Y-%m-%dT%H:%M:%S.%NZ
path /usr/local/logs/fileserver-api.log
pos_file /var/log/td-agent/fileserver.api.log.pos
read_from_head true
tag dev.fileserver-api.log
</source>
#nginx log collection
<source>
@type tail
#format nginx
format /^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<request_time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)"(?:\s+(?<http_x_forwarded_for>[^ ]+))?)?$/
time_format %d/%b/%Y:%H:%M:%S %z
path /var/log/nginx/*web.access.log
pos_file /var/log/td-agent/nginx.test.web.access.log.pos
read_from_head true
tag nginx.test.web.access
</source>
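To sanity-check the access-log regex above, here is a small Python sketch applying it to a made-up log line (same pattern, with Ruby's `(?<name>)` groups rewritten as Python's `(?P<name>)`):

```python
import re

# The access-log regex from the config above, in Python group syntax.
nginx_access = re.compile(
    r'^(?P<remote>[^ ]*) (?P<host>[^ ]*) (?P<user>[^ ]*) '
    r'\[(?P<request_time>[^\]]*)\] '
    r'"(?P<method>\S+)(?: +(?P<path>[^"]*?)(?: +\S*)?)?" '
    r'(?P<code>[^ ]*) (?P<size>[^ ]*)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
    r'(?:\s+(?P<http_x_forwarded_for>[^ ]+))?)?$'
)

# Sample line for illustration only.
line = ('192.168.1.10 - alice [08/Jan/2019:12:00:00 +0800] '
        '"GET /index.html HTTP/1.1" 200 612 '
        '"http://example.com/" "Mozilla/5.0"')

m = nginx_access.match(line)
print(m.group("remote"), m.group("method"), m.group("path"), m.group("code"))
```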
<source>
@type tail
#format nginx
format /^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<request_time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)"(?:\s+(?<http_x_forwarded_for>[^ ]+))?)?$/
path /var/log/nginx/*web.access.log
time_format %d/%b/%Y:%H:%M:%S %z
pos_file /var/log/td-agent/nginx.test.api.access.log.pos
read_from_head true
tag nginx.test.api.access
</source>
<source>
@type tail
format /^(?<request_time>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[(?<log_level>\w+)\] (?<pid>\d+).(?<tid>\d+): (?<message>.*)$/
path /var/log/nginx/*web.error.log
pos_file /var/log/td-agent/nginx.test.web.error.log.pos
read_from_head true
tag nginx.test.web.error
</source>
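The error-log regex can be checked the same way. Note that the bare `.` between pid and tid happens to match nginx's `#` separator. The sample line is made up:

```python
import re

# The error-log regex from the source block above, in Python group syntax.
nginx_error = re.compile(
    r'^(?P<request_time>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) '
    r'\[(?P<log_level>\w+)\] (?P<pid>\d+).(?P<tid>\d+): (?P<message>.*)$'
)

line = ('2019/01/08 12:00:00 [error] 1234#5678: '
        '*1 connect() failed (111: Connection refused)')

m = nginx_error.match(line)
print(m.group("log_level"), m.group("pid"), m.group("tid"))
```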
<match nginx.**>
@type elasticsearch
host gdg-test
port 9200
flush_interval 10s
index_name ${tag}-%Y.%m.%d
type_name ${tag}-%Y.%m.%d
logstash_format true
logstash_prefix ${tag}
#include_tag_key true
#tag_key @log_name
<buffer tag, time>
timekey 1h
</buffer>
</match>
After editing the configuration, restart fluentd:
/etc/init.d/td-agent restart
After a while, check that the log indices have appeared in ES.
Then create the matching index patterns in Kibana to search and analyze the logs.
4、Configuration notes
#sync the logs to Elasticsearch
#** matches any number of tag parts, e.g. <match test**>
#to report per log file, configure multiple <match> and <source> blocks, distinguished by tag, index_name and type_name
#elasticsearch plugin settings reference: https://docs.fluentd.org/v1.0/articles/out_elasticsearch#index_name-(optional)
<match test.article>
@type elasticsearch
host elasticsearch_host
port 9200
flush_interval 10s
logstash_format true #when true, the index is named <prefix>-<date> (index_name is ignored) and an @timestamp field recording the read time is added
logstash_prefix ${tag} #name indices <tag>-<date> instead of the default logstash-<date>
index_name ${tag}-%Y.%m.%d
type_name ${tag}-%Y.%m.%d
include_tag_key true #write the tag into ES as a field
tag_key @log_name
<buffer tag, time> #required for the %Y.%m.%d time placeholders in ${tag}-%Y.%m.%d to be filled in
timekey 1h
</buffer>
</match>
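As a rough sketch of the naming behavior described in the comments above: with logstash_format true, the index name is the logstash_prefix plus the event date (the plugin's default logstash_dateformat is %Y.%m.%d). The tag and timestamp below are illustrative only:

```python
from datetime import datetime, timezone

def index_name(tag: str, event_time: datetime, dateformat: str = "%Y.%m.%d") -> str:
    # With logstash_prefix ${tag}, the prefix is the event's tag,
    # so each tag ends up in its own daily index.
    return f"{tag}-{event_time.strftime(dateformat)}"

name = index_name("test.article", datetime(2019, 1, 8, tzinfo=timezone.utc))
print(name)  # test.article-2019.01.08
```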
#parsing the log lines
#log format: 2018-12-27 20:41:40,649 ERROR [http-nio-8076-exec-1] com.amd5.community.reward.service.impl.MallServiceImpl.getChosenConsigneeAddress:546 - Not add consignee address about user[id=123] yet
#format none means no parsing; supported values: regexp, apache2, apache_error, nginx, syslog, tsv, csv, ltsv, json, none, multiline https://docs.fluentd.org/v1.0/articles/parser_nginx
#regular-expression tester: http://fluentular.herokuapp.com
<source>
@type tail
format /(?<access_time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}.\d+)\s+(?<level>\w*)\s+(?<thread_id>\S*)\s+(?<message>.*)$/
path /var/log/amd5.java.log #make sure fluentd can read the log file; without permission it will error out, so add the td-agent user to the log file's group if needed
pos_file /var/log/td-agent/dev.api.log.pos # records the last read position; create this file in advance and chown it to td-agent
read_from_head true #read the file from the beginning
time_format %Y-%m-%dT%H:%M:%S.%NZ
tag test.article
</source>
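Applying the format regex from this source block to the sample log line quoted in the comments above shows how the fields split out (Python spells Ruby's `(?<name>)` groups as `(?P<name>)`):

```python
import re

# The format regex from the <source> above, in Python group syntax.
log4j = re.compile(
    r'(?P<access_time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}.\d+)\s+'
    r'(?P<level>\w*)\s+(?P<thread_id>\S*)\s+(?P<message>.*)$'
)

line = ('2018-12-27 20:41:40,649 ERROR [http-nio-8076-exec-1] '
        'com.amd5.community.reward.service.impl.MallServiceImpl'
        '.getChosenConsigneeAddress:546 - Not add consignee address '
        'about user[id=123] yet')

m = log4j.match(line)
print(m.group("access_time"), m.group("level"), m.group("thread_id"))
```

Note that the bare `.` before `\d+` matches the comma in log4j's `40,649` millisecond separator.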
#filter the collected logs. Note that multiple <regexp> blocks are ANDed: a record is kept only if its level field matches ERROR and its message field also matches error. Reference: https://docs.fluentd.org/v0.12/articles/filter_grep
#to add or remove fields, use @type record_transformer, see https://docs.fluentd.org/v1.0/articles/filter_record_transformer
<filter log**>
@type grep
<regexp>
key level
pattern ERROR
</regexp>
<regexp>
key message
pattern error
</regexp>
</filter>
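A minimal sketch of the grep filter's semantics, assuming the AND behavior of multiple <regexp> blocks; the field names and records are illustrative:

```python
import re

# Each (key, pattern) pair mirrors one <regexp> block; a record is
# kept only if EVERY pattern matches its field (AND semantics).
rules = [("level", re.compile("ERROR")), ("message", re.compile("error"))]

def keep(record: dict) -> bool:
    return all(rx.search(str(record.get(key, ""))) for key, rx in rules)

records = [
    {"level": "ERROR", "message": "db error: timeout"},  # both match -> kept
    {"level": "ERROR", "message": "started"},            # message fails -> dropped
    {"level": "INFO",  "message": "minor error"},        # level fails -> dropped
]
kept = [r for r in records if keep(r)]
print(len(kept))  # 1
```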
5、Useful commands
#add a template; all indices matching test* will use it
curl -H "Content-Type: application/json" -XPOST master:9200/_template/duoyun_test -d'
{
"template": "test*",
"order": 1,
"settings": {
"number_of_shards": 1
},
"mappings": {
"test*" : {
"properties": {
"@timestamp":{
"type":"date"
}
}
}
}
}'
#delete all indices
curl -XDELETE -u elastic:changeme gdg-dev:9200/_all
#delete a specific index
curl -XDELETE -u elastic:changeme gdg-dev:9200/test-2019.01.08
#delete custom templates
curl -XDELETE master:9200/_template/temp*
#view the template_1 template
curl -XGET master:9200/_template/template_1
#get details of a specific index
curl -XGET 'master:9200/system-syslog-2018.12?pretty'
#get mappings for all indices
curl -XGET master:9200/_mapping?pretty