Logging Operator - An Elegant Cloud-Native Log Management Solution (Part 1)

Published: 2024/3/24

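Logging Operator is an open-source log collection solution for cloud-native environments from BanzaiCloud. I once reposted an article by Cui that introduced it, but at the time I felt that running both Fluent Bit and Fluentd in a single k8s cluster made for a bloated architecture, so my first impression was that it did not fit my needs. Later, while designing log management for k8s clusters in a multi-tenant scenario, I found that the traditional approach of managing all log configuration centrally is very inflexible. Operators tend to take a global view and try to turn the log configuration into templates that fit every workload. Over time the templates grow large and bloated, which makes maintenance hard for both the current owner and whoever takes over.

It was only after studying Logging Operator recently that I realized managing logs the Kubernetes way can be a real pleasure. Before getting started, let's look at its architecture.

In this architecture, Logging Operator uses CRDs to take part in the configuration of three stages of the log pipeline: collection, routing, and output. Under the hood it still deploys two components in the cluster, FluentBit as a DaemonSet and Fluentd as a StatefulSet. FluentBit collects container logs, does some initial processing, and forwards them to Fluentd for further parsing and routing. Fluentd then forwards the results to the various downstream services.

So once services are containerized, whether logs should be written to standard output or to files on disk is a question worth discussing.

Besides managing the log workflow, Logging Operator also lets administrators enable TLS to encrypt log traffic inside the cluster network, and it integrates ServiceMonitor by default to expose the state of the log collectors. Most important of all, because the configuration is expressed as CRDs, our logging policies can finally be managed per tenant within the cluster.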

1. Logging Operator CRD

Logging Operator has only 5 core CRDs:

logging: defines the base configuration of the log collector (FluentBit) and log forwarder (Fluentd) services;
flow: defines namespace-level rules for filtering, parsing, and routing logs;
clusterflow: defines cluster-level rules for filtering, parsing, and routing logs;
output: defines namespace-level log outputs and their parameters;
clusteroutput: defines cluster-level log outputs and their parameters, and can be referenced by flows in other namespaces;

With these 5 CRDs we can define where the container logs of every namespace in a Kubernetes cluster flow.
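To make the relationship between flow and output a little more concrete, here is a minimal sketch of a namespaced pair. It is an illustration rather than a recommended setup: the namespace `demo`, the resource names, and the `app: nginx` label are all hypothetical, and the field names (`match`, `select`, `localOutputRefs`, `nullout`) are written from my understanding of the v1beta1 API, so check them against the version you run.

```yaml
# Hypothetical example: route logs from pods labeled app=nginx in the
# "demo" namespace to an output that simply discards them.
apiVersion: logging.banzaicloud.io/v1beta1
kind: Output
metadata:
  name: demo-output          # hypothetical name
  namespace: demo            # hypothetical namespace
spec:
  nullout: {}                # assumed "discard" output, handy for dry runs
---
apiVersion: logging.banzaicloud.io/v1beta1
kind: Flow
metadata:
  name: demo-flow            # hypothetical name
  namespace: demo
spec:
  match:
    - select:
        labels:
          app: nginx         # hypothetical pod label
  localOutputRefs:           # refers to an Output in the same namespace
    - demo-output
```

A clusterflow and clusteroutput pair would look similar, except that they are meant to apply across namespaces.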

2. Logging Operator installation

Logging Operator requires Kubernetes 1.14 or later and can be installed either with Helm or with manifests.

Helm (v3.2.1+) installation

    $ helm repo add banzaicloud-stable https://kubernetes-charts.banzaicloud.com
    $ helm repo update
    $ helm upgrade --install --wait --create-namespace --namespace logging logging-operator banzaicloud-stable/logging-operator \
        --set createCustomResource=false

Manifest installation

    $ kubectl create ns logging

    # RBAC
    $ kubectl -n logging create -f https://raw.githubusercontent.com/banzaicloud/logging-operator-docs/master/docs/install/manifests/rbac.yaml

    # CRD
    $ kubectl -n logging create -f https://raw.githubusercontent.com/banzaicloud/logging-operator/master/config/crd/bases/logging.banzaicloud.io_clusterflows.yaml
    $ kubectl -n logging create -f https://raw.githubusercontent.com/banzaicloud/logging-operator/master/config/crd/bases/logging.banzaicloud.io_clusteroutputs.yaml
    $ kubectl -n logging create -f https://raw.githubusercontent.com/banzaicloud/logging-operator/master/config/crd/bases/logging.banzaicloud.io_flows.yaml
    $ kubectl -n logging create -f https://raw.githubusercontent.com/banzaicloud/logging-operator/master/config/crd/bases/logging.banzaicloud.io_loggings.yaml
    $ kubectl -n logging create -f https://raw.githubusercontent.com/banzaicloud/logging-operator/master/config/crd/bases/logging.banzaicloud.io_outputs.yaml

    # Operator
    $ kubectl -n logging create -f https://raw.githubusercontent.com/banzaicloud/logging-operator-docs/master/docs/install/manifests/deployment.yaml

Once the installation is complete, verify the state of the services:

    # Operator status
    $ kubectl -n logging get pods
    NAME                                        READY   STATUS    RESTARTS   AGE
    logging-logging-operator-599c9cf846-5nw2n   1/1     Running   0          52s

    # CRD status
    $ kubectl get crd | grep banzaicloud.io
    NAME                                    CREATED AT
    clusterflows.logging.banzaicloud.io     2021-03-25T08:49:30Z
    clusteroutputs.logging.banzaicloud.io   2021-03-25T08:49:30Z
    flows.logging.banzaicloud.io            2021-03-25T08:49:30Z
    loggings.logging.banzaicloud.io         2021-03-25T08:49:30Z
    outputs.logging.banzaicloud.io          2021-03-25T08:49:30Z

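3. Logging Operator configuration

3.1 logging

LoggingSpec

LoggingSpec defines the logging infrastructure services that collect and transport log messages, including the configuration of Fluentd and Fluent-bit. Both are deployed in the namespace specified by controlNamespace. A simple example: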

    apiVersion: logging.banzaicloud.io/v1beta1
    kind: Logging
    metadata:
      name: default-logging-simple
      namespace: logging
    spec:
      fluentd: {}
      fluentbit: {}
      controlNamespace: logging

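This example tells the Operator to create a logging service with default settings in the logging namespace, consisting of FluentBit and Fluentd.

In production, of course, we would not deploy FluentBit and Fluentd with only the defaults. There are usually many aspects to consider, for example:

Custom images
Persistence of the log collection position files
Persistence of buffer data
CPU/memory resource limits
Status monitoring
Fluentd replica count and load balancing
Network parameter tuning
Container runtime security

Fortunately LoggingSpec covers all of the above fairly well, and we can follow the documentation to customize our own service.

I'll pick a few important fields and explain what they are for:

watchNamespaces

Specifies the namespaces in which the Operator watches Flow and Output resources. In a multi-tenant scenario where each tenant defines its own logging architecture with a logging resource, you can use watchNamespaces to bind that resource to the tenant's namespaces and narrow the set of resources it picks up.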
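As a rough sketch of the multi-tenant case described above, a per-tenant logging resource might look like the following. The resource name and all namespace names are hypothetical placeholders; only the watchNamespaces and controlNamespace fields come from the text.

```yaml
apiVersion: logging.banzaicloud.io/v1beta1
kind: Logging
metadata:
  name: tenant-a-logging               # hypothetical name
spec:
  controlNamespace: tenant-a-logging   # hypothetical namespace for this tenant's FluentBit and Fluentd
  watchNamespaces:                     # only Flow/Output resources in these namespaces are picked up
    - tenant-a                         # hypothetical tenant namespaces
    - tenant-a-staging
  fluentd: {}
  fluentbit: {}
```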

allowClusterResourcesFromAllNamespaces

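Cluster-wide resources such as ClusterOutput and ClusterFlow take effect by default only in the namespace referenced by controlNamespace. If they are defined in any other namespace they are ignored, unless allowClusterResourcesFromAllNamespaces is set to true.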
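A minimal sketch of turning this on, reusing the default-logging-simple resource from the earlier example (everything except the extra flag is unchanged):

```yaml
apiVersion: logging.banzaicloud.io/v1beta1
kind: Logging
metadata:
  name: default-logging-simple
spec:
  controlNamespace: logging
  # Accept ClusterFlow/ClusterOutput resources from namespaces other than "logging"
  allowClusterResourcesFromAllNamespaces: true
  fluentd: {}
  fluentbit: {}
```

Whether this is a good idea depends on how much you trust the tenants, since it lets any namespace define cluster-level resources.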

LoggingSpec reference: https://banzaicloud.com/docs/one-eye/logging-operator/configuration/crds/v1beta1/logging_types/

FluentbitSpec

filterKubernetes

The plugin that enriches logs with Kubernetes metadata. Example:

    apiVersion: logging.banzaicloud.io/v1beta1
    kind: Logging
    metadata:
      name: default-logging-simple
    spec:
      fluentd: {}
      fluentbit:
        filterKubernetes:
          Kube_URL: "https://kubernetes.default.svc:443"
          Match: "kube.*"
      controlNamespace: logging

It can also be turned off with disableKubernetesFilter:

    apiVersion: logging.banzaicloud.io/v1beta1
    kind: Logging
    metadata:
      name: default-logging-simple
    spec:
      fluentd: {}
      fluentbit:
        disableKubernetesFilter: true
      controlNamespace: logging

filterKubernetes reference: https://banzaicloud.com/docs/one-eye/logging-operator/configuration/crds/v1beta1/fluentbit_types/#filterkubernetes

inputTail

Defines FluentBit's tail input configuration. There are many fine-grained parameters here; below is the configuration I currently use:

    apiVersion: logging.banzaicloud.io/v1beta1
    kind: Logging
    metadata:
      name: default-logging-simple
    spec:
      fluentbit:
        inputTail:
          Skip_Long_Lines: "true"
          #Parser: docker
          Parser: cri
          Refresh_Interval: "60"
          Rotate_Wait: "5"
          Mem_Buf_Limit: "128M"
          #Docker_Mode: "true"
          Docker_Mode: "false"

If the container runtime of the Kubernetes cluster is Containerd or another CRI runtime, change Parser to cri and disable Docker_Mode.

inputTail reference: https://banzaicloud.com/docs/one-eye/logging-operator/configuration/crds/v1beta1/fluentbit_types/#inputtail

buffers

Defines FluentBit's buffer settings, which matter quite a bit. Because FluentBit is deployed in the Kubernetes cluster as a DaemonSet, we can simply mount a hostPath volume to give it persistent storage. Example:

    apiVersion: logging.banzaicloud.io/v1beta1
    kind: Logging
    metadata:
      name: default-logging-simple
    spec:
      fluentbit:
        bufferStorage:
          storage.backlog.mem_limit: 10M
          storage.path: /var/log/log-buffer
        bufferStorageVolume:
          hostPath:
            path: "/var/log/log-buffer"

bufferStorage reference: https://banzaicloud.com/docs/one-eye/logging-operator/configuration/crds/v1beta1/fluentbit_types/#bufferstorage

positiondb

Defines where FluentBit stores the file position information for the logs it tails. Again, a hostPath volume works. Example:

    apiVersion: logging.banzaicloud.io/v1beta1
    kind: Logging
    metadata:
      name: default-logging-simple
    spec:
      fluentbit:
        positiondb:
          hostPath:
            path: "/var/log/positiondb"

image

Specifies a custom FluentBit image. I strongly recommend an image at FluentBit 1.7.3 or later, which fixes many network connection timeout problems on the collector side. Example:

    apiVersion: logging.banzaicloud.io/v1beta1
    kind: Logging
    metadata:
      name: default-logging-simple
    spec:
      fluentbit:
        image:
          repository: fluent/fluent-bit
          tag: 1.7.3
          pullPolicy: IfNotPresent

metrics

Defines the port on which FluentBit exposes its monitoring metrics, together with the integrated ServiceMonitor scrape definition. Example:

    apiVersion: logging.banzaicloud.io/v1beta1
    kind: Logging
    metadata:
      name: default-logging-simple
    spec:
      fluentbit:
        metrics:
          interval: 60s
          path: /api/v1/metrics/prometheus
          port: 2020
          serviceMonitor: true

resources

Defines FluentBit's resource requests and limits. Example:

    apiVersion: logging.banzaicloud.io/v1beta1
    kind: Logging
    metadata:
      name: default-logging-simple
    spec:
      fluentbit:
        resources:
          limits:
            cpu: "1"
            memory: 512Mi
          requests:
            cpu: 200m
            memory: 128Mi

security

Defines the security settings FluentBit runs with, including PSP, RBAC, securityContext, and podSecurityContext. Together they control the privileges inside the FluentBit container. Example:

    apiVersion: logging.banzaicloud.io/v1beta1
    kind: Logging
    metadata:
      name: default-logging-simple
    spec:
      fluentbit:
        security:
          podSecurityPolicyCreate: true
          roleBasedAccessControlCreate: true
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
          podSecurityContext:
            fsGroup: 101

Performance parameters

These are FluentBit parameters related to runtime performance, including:

1. Requiring an acknowledgement response from the forward upstream

    apiVersion: logging.banzaicloud.io/v1beta1
    kind: Logging
    metadata:
      name: default-logging-simple
    spec:
      fluentbit:
        forwardOptions:
          Require_ack_response: true

2. TCP connection parameters

    apiVersion: logging.banzaicloud.io/v1beta1
    kind: Logging
    metadata:
      name: default-logging-simple
    spec:
      fluentbit:
        network:
          connectTimeout: 30
          keepaliveIdleTimeout: 60

3. Enabling load-balancing mode

    apiVersion: logging.banzaicloud.io/v1beta1
    kind: Logging
    metadata:
      name: default-logging-simple
    spec:
      fluentbit:
        enableUpstream: true

4. Tolerations for tainted nodes

    apiVersion: logging.banzaicloud.io/v1beta1
    kind: Logging
    metadata:
      name: default-logging-simple
    spec:
      fluentbit:
        tolerations:
          - effect: NoSchedule
            key: node-role.kubernetes.io/master

FluentdSpec

buffers

This mainly defines persistence for Fluentd's buffer data. Because Fluentd is deployed as a StatefulSet, hostPath is not a good fit. Instead we should use a PersistentVolumeClaim template so that each Fluentd instance gets its own dedicated buffer volume. Example:

    apiVersion: logging.banzaicloud.io/v1beta1
    kind: Logging
    metadata:
      name: default-logging-simple
    spec:
      fluentd:
        bufferStorageVolume:
          pvc:
            spec:
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 50Gi
              storageClassName: csi-rbd
              volumeMode: Filesystem

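If storageClassName is not specified, the Operator creates the PVC through the storage plugin whose StorageClass is marked as default.

FluentOutLogrotate

Defines redirection of Fluentd's standard output to a file. The main purpose is to avoid a chain reaction when errors occur, where an error message is fed back into the system as a log message and generates yet another error. Example: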

    apiVersion: logging.banzaicloud.io/v1beta1
    kind: Logging
    metadata:
      name: default-logging-simple
    spec:
      fluentd:
        fluentOutLogrotate:
          enabled: true
          path: /fluentd/log/out
          age: 10
          size: 10485760

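This redirects Fluentd's own log to /fluentd/log/out, keeps 10 rotated files (a numeric age in Fluentd's log rotation counts generations of files rather than days), and caps each file at 10 MiB (10485760 bytes).

FluentOutLogrotate reference: https://banzaicloud.com/docs/one-eye/logging-operator/configuration/crds/v1beta1/fluentd_types/#fluentoutlogrotate

Scaling

This mainly defines the number of Fluentd replicas. If FluentBit has Upstream support enabled, changing the Fluentd replica count triggers a rolling update of FluentBit. Example: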

    apiVersion: logging.banzaicloud.io/v1beta1
    kind: Logging
    metadata:
      name: default-logging-simple
    spec:
      fluentd:
        scaling:
          replicas: 4

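scaling reference: https://banzaicloud.com/docs/one-eye/logging-operator/configuration/crds/v1beta1/fluentd_types/#fluentdscaling

Worker

Defines the number of workers inside Fluentd. Because Fluentd is constrained by Ruby, it processes the log workflow in a single process by default. Increasing the number of workers can significantly improve Fluentd's concurrency. Example: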

    apiVersion: logging.banzaicloud.io/v1beta1
    kind: Logging
    metadata:
      name: default-logging-simple
    spec:
      fluentd:
        workers: 2

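When the number of workers is greater than 1, Operator versions before 3.9.2 do not handle persistent storage of Fluentd buffer data well, which may cause the Fluentd container to crash.

image

Defines the Fluentd image. It must be an image customized for Logging Operator. The image version can be customized, and the structure is similar to FluentBit's.

security

Defines the security settings Fluentd runs with, including PSP, RBAC, securityContext, and podSecurityContext. The structure is similar to FluentBit's.

metrics

Defines the port on which Fluentd exposes its monitoring metrics, together with the integrated ServiceMonitor scrape definition. The structure is similar to FluentBit's.

resources

Defines Fluentd's resource requests and limits. The structure is similar to FluentBit's.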
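To see how the pieces fit together, here is a sketch that merges several of the fragments above into a single logging resource. It only reuses fields shown earlier in this article. The Fluentd resources block assumes the same structure as FluentBit's, as described above, and its values are illustrative guesses rather than tested recommendations.

```yaml
apiVersion: logging.banzaicloud.io/v1beta1
kind: Logging
metadata:
  name: default-logging-simple
spec:
  controlNamespace: logging
  fluentbit:
    inputTail:
      Parser: cri                  # for Containerd or other CRI runtimes
      Docker_Mode: "false"
      Mem_Buf_Limit: "128M"
    bufferStorage:
      storage.path: /var/log/log-buffer
    bufferStorageVolume:
      hostPath:
        path: "/var/log/log-buffer"
    positiondb:
      hostPath:
        path: "/var/log/positiondb"
    enableUpstream: true           # balance across the Fluentd replicas
    resources:
      limits:
        cpu: "1"
        memory: 512Mi
      requests:
        cpu: 200m
        memory: 128Mi
  fluentd:
    scaling:
      replicas: 4
    bufferStorageVolume:
      pvc:
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 50Gi
    fluentOutLogrotate:
      enabled: true
      path: /fluentd/log/out
      age: 10
      size: 10485760
    resources:                     # assumed to mirror the FluentBit structure
      limits:
        cpu: "1"                   # illustrative values only
        memory: 1Gi
      requests:
        cpu: 500m
        memory: 512Mi
```

The workers field is left out on purpose: as noted above, combining more than one worker with persistent buffers can be problematic on Operator versions before 3.9.2.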

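Interim summary

This article covered the architecture, deployment, and CRDs of Logging Operator, and described the logging definition and its important parameters in detail. These settings become very important when you use the Operator to collect logs in production, so please read the documentation carefully before using it.

Because there is a lot to cover in Logging Operator, I will go through Flow, ClusterFlow, Output, ClusterOutput, and the various plugins in the next few installments. Stay tuned.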
