
Filebeat + ES Log Analysis


Download Filebeat from the official site

Download and installation details are omitted here. Note: keep the Filebeat and ES versions the same, otherwise you may get errors.

Configure filebeat.yml

The key settings are:

log file paths

merging the multiple lines of a single log entry into one event

the Kibana and ES connections

See the official docs: https://www.elastic.co/guide/en/beats/filebeat/6.3/index.html

Here is my configuration:

###################### Filebeat Configuration Example #########################

# This file is an example configuration file highlighting only the most common
# options. The filebeat.reference.yml file from the same directory contains all the
# supported options with more comments. You can use it as a reference.
#
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/filebeat/index.html

# For more available modules and options, please see the filebeat.reference.yml sample
# configuration file.

#=========================== Filebeat inputs =============================

filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.

- type: log
  enabled: true
  paths:
    - /xxx/*/*/*.log
    - /xxx/xxx/*/*/*.log
  # exclude_files: ['/home/zile/prod-log/gateway/*/*/*/*']
  ignore_older: 12h
  tail_files: false

  # Optional additional fields. These fields can be freely picked
  # to add additional information to the crawled log files for filtering

  ### Multiline options: merge multi-line entries; a new entry starts with a date
  multiline.pattern: ^\[?\d{4}\-\d{2}\-\d{2}
  multiline.negate: true
  multiline.match: after

#============================== Kibana =====================================

# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:
  # Kibana Host
  # Scheme and port can be left out and will be set to the default (http and 5601)
  # In case you specify an additional path, the scheme is required: http://localhost:5601/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
  host: "https://xxxxx:5601"
  username: "xxx"
  password: "xxxx"

#================================ Outputs =====================================

# Configure what output to use when sending the data collected by the beat.

#-------------------------- Elasticsearch output ------------------------------
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["xxxx:9200"]
  # Optional protocol and basic auth credentials.
  protocol: "http"
  username: "xxx"
  password: "xxx"
  # ES ingest pipeline for preprocessing; this can be configured later (see below)
  pipeline: log-pipe

# ......
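For illustration, here is a hypothetical log excerpt in the logback layout this setup targets (it matches the <pattern> comments shown in the pipeline section below; the class, method, and thread names are made up). Lines that do not start with a [yyyy-MM-dd ...] timestamp fail multiline.pattern, and with negate: true and match: after they are appended to the previous line, so the stack trace below ships as a single event:

[2021-01-01 12:00:00:123] [ai-course] [ERROR] [http-nio-8080-exec-1] [UserController.login:42] - login failed
java.lang.NullPointerException: null
    at com.example.UserController.login(UserController.java:42)
[2021-01-01 12:00:01:456] [ai-course] [INFO] [http-nio-8080-exec-2] [UserController.login:38] - login ok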

Start Filebeat

Start Filebeat in debug mode to check that it runs properly:

./filebeat -e -d "*"

The debug output makes it easy to troubleshoot problems.
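Filebeat's built-in self-tests are also handy before a full run (run them from the Filebeat install directory; the config path is whatever you use):

./filebeat test config -c filebeat.yml   # validate the YAML and settings
./filebeat test output -c filebeat.yml   # check connectivity to the configured ES output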

Configure an ES ingest pipeline to preprocess the log data

Before the logs are indexed into ES, an ingest pipeline preprocesses them and extracts some fields, which makes later searching easier.

Official docs: https://www.elastic.co/guide/en/elasticsearch/reference/6.3/ingest.html

Define the pipeline in the Kibana Dev Tools console.

grok defines the log formats to match:

# set pipeline for common
# <pattern>[%d{yyyy-MM-dd HH:mm:ss:SSS}] [ai-course] [%level] [%thread] [%F.%M:%L] - %msg%n</pattern>
# <pattern>[%d{yyyy-MM-dd HH:mm:ss:SSS}] [ai-course] %level - %msg%n</pattern>
PUT _ingest/pipeline/log-pipe
{
  "description": "for log",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": [
          """\[%{TIMESTAMP_ISO8601:log_time}\] \[%{NOTSPACE:server}\] \[%{LOGLEVEL:log_level}\] \[%{NOTSPACE:thread}\] \[%{NOTSPACE:java_class}\] - %{GREEDYDATA:content}""",
          """\[%{TIMESTAMP_ISO8601:log_time}\] \[%{NOTSPACE:server}\] %{LOGLEVEL:log_level} - %{GREEDYDATA:content}"""
        ],
        "ignore_failure": true
      }
    },
    {
      "date": {
        "field": "log_time",
        "formats": ["yyyy-MM-dd HH:mm:ss:SSS"],
        "timezone": "Asia/Shanghai",
        "target_field": "@timestamp",
        "ignore_failure": true
      }
    }
  ]
}
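Before pointing Filebeat at the pipeline, you can verify the grok and date processors with the simulate API; the message below is a made-up line in the first pattern's format:

POST _ingest/pipeline/log-pipe/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "[2021-01-01 12:00:00:123] [ai-course] [ERROR] [http-nio-8080-exec-1] [UserController.login:42] - login failed"
      }
    }
  ]
}

The response should show log_time, server, log_level, thread, java_class, and content extracted, with @timestamp set from log_time.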

Running it reliably

See my earlier post on running Elasticsearch under supervisor:
https://blog.csdn.net/Ayhan_huang/article/details/100096183

For installing supervisor:

The stock CentOS 7 repositories do not carry supervisor, so you need to install the EPEL repository first.

https://cloudwafer.com/blog/how-to-install-and-configure-supervisor-on-centos-7/

For supervisor usage, see: https://blog.csdn.net/Ayhan_huang/article/details/79023553

Make sure the user that runs Filebeat has permissions on the Filebeat directory; a sample program definition follows below.
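A minimal supervisor program definition might look like the following. The install path, user, and log paths are placeholders for your environment, and -e keeps Filebeat logging to stderr so supervisor can capture it:

; /etc/supervisord.d/filebeat.ini (the EPEL package reads *.ini files from this directory)
[program:filebeat]
command=/opt/filebeat/filebeat -e -c /opt/filebeat/filebeat.yml
directory=/opt/filebeat
user=filebeat
autostart=true
autorestart=true
stdout_logfile=/var/log/supervisor/filebeat.out.log
stderr_logfile=/var/log/supervisor/filebeat.err.log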

Once the supervisor configuration is in place, run the following commands to apply it and start the process:

systemctl daemon-reload
systemctl restart supervisord

You can then check on the Filebeat process with supervisorctl status.
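If supervisord is already running and you only added or changed a program definition, you can also reload it without restarting the daemon:

supervisorctl reread            # pick up new or changed config files
supervisorctl update            # (re)start the affected programs
supervisorctl status filebeat   # confirm the process is RUNNING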

Periodically delete expired log indices from ES

By default, Filebeat creates one index per day. With a large log volume this consumes a lot of ES storage, so consider deleting older indices on a schedule.

Here a Python script combined with a Linux cron job takes care of it:

For Linux cron jobs, see: https://blog.csdn.net/Ayhan_huang/article/details/72833436

Below is my Python script. It mainly uses the following two ES APIs:

  • GET _cat/indices to list the indices
  • DELETE /index_name to delete an index
""" 檢查ES日志索引,并自動刪除3天前的索引,如果不足3個,不刪除 """ import re import requestsHOST = "xxxx:9200" USERNAME = "xx" PASSWORD = "xxx" LOG_KEEP_DAYS = 3URL = f"http://{USERNAME}:{PASSWORD}@{HOST}"# 獲取索引: # https://www.elastic.co/guide/en/elasticsearch/reference/6.3/cat-indices.html res = requests.get(f"{URL}/_cat/indices/filebeat*?v&s=index") indices = re.findall(r"(filebeat.*?)\s", res.text)if len(indices) <= LOG_KEEP_DAYS:exit(0)# 將索引按照日期排序 indices.sort() print("logs:") print(indices)""" 自然排序后結果示例: [ 'filebeat-6.3.2-1020.11.26', 'filebeat-6.3.2-2019.12.28', 'filebeat-6.3.2-2020.01.31', 'filebeat-6.3.2-2020.11.27', 'filebeat-6.3.2-2020.12.01', 'filebeat-6.3.2-2021.01.01' ] """# 保留最近3天的日志,更早的刪除 dropped_indices = indices[0: len(indices) - LOG_KEEP_DAYS] print("del logs:") print(dropped_indices)# 刪除索引: # https://www.elastic.co/guide/en/elasticsearch/reference/6.3/indices-delete-index.html dropped_indices_str = ','.join(dropped_indices) res = requests.delete(f"{URL}/{dropped_indices_str}") print(res.text)

Summary

That covers the full Filebeat + ES setup: collection and multiline merging in filebeat.yml, field extraction with an ingest pipeline, process supervision with supervisor, and scheduled cleanup of old indices.
