choerodon-logging installation fails

  • Choerodon platform version: 0.21

  • Steps that led to the problem:

  • Documentation URL:
    https://choerodon.io/zh/docs/installation-configuration/steps/operation/logging/
    helm install c7n/choerodon-logging \
    --set fluent-bit.es.host="elasticsearch.logging" \
    --version=0.8.2 \
    --name=choerodon-logging \
    --namespace=logging
    After installation, the three related pods are in CrashLoopBackOff:
    choerodon-logging-57ddc77848-shqgg 0/1 CrashLoopBackOff
    fluent-bit-choerodon-logging-v6wkz 0/1 CrashLoopBackOff
    fluent-bit-choerodon-logging-zqddh 0/1 CrashLoopBackOff

  • Environment information (e.g. node information):
    One master, one node

  • Error log:
    Log output from the choerodon-logging pod:
    panic: the server could not find the requested resource

goroutine 1 [running]:
github.com/vinkdong/logging-agent/fluent-bit.(*FluentBit).DiscoveryDeployment(0xc4202b9f80)
/Users/vink/go/src/github.com/vinkdong/logging-agent/fluent-bit/common.go:130 +0x1bf5
main.main()
/Users/vink/go/src/github.com/vinkdong/logging-agent/main.go:64 +0x141

fluent-bit-choerodon error output:
Error: No Input(s) have been defined. Aborting

Fluent-Bit v0.13.6
Copyright (C) Treasure Data

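  • Cause analysis:

    Describe how you analyzed the problem so that we can locate it more accurately
    I could not find the logging-agent source code, and the error log is too sparse to tell what is causing the problem.

  • Questions:

    State any questions you have about encountering and resolving this problem
    What causes this error, and how should it be handled?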

If there are no pods in your cluster whose logs can be collected, choerodon-logging will stay in CrashLoopBackOff.

With so many pods installed, surely some of them have collectable logs? Does a pod need any special configuration for its logs to be collected?

Modify the chart and configure the log collection options in values.yaml:

## Log collection
logs:
  enabled: true
  # Parser (log format) used for collection
  parser: spring-boot

Add the following template to _helpers.tpl:

{{- define "service.logging.deployment.label" -}}
choerodon.io/logs-parser: {{ .Values.logs.parser | quote }}
{{- end -}}

Add the logging label in deployment.yaml:

metadata:
  name: {{ .Release.Name }}
  labels:
{{- if .Values.logs.enabled }}
{{ include "service.logging.deployment.label" . | indent 4 }}
{{- end }}

NAME READY STATUS RESTARTS AGE LABELS
base-service-5dc64c8597-htmwg 1/1 Running 0 9h choerodon.io/logs-parser=spring-boot,choerodon.io/metrics-port=8031,choerodon.io/release=base-service,choerodon.io/service=base-service,choerodon.io/version=0.21.4,pod-template-hash=5dc64c8597

I added the choerodon.io/logs-parser=spring-boot label to base-service, but choerodon-logging is still in CrashLoopBackOff.
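The label list above is in the format printed by `kubectl get ... --show-labels` for a pod. Because the function in the panic trace is called DiscoveryDeployment, it may be the labels on the Deployment object that matter rather than those on the pod; that is only an inference from the function name, not something confirmed in this thread. A minimal sketch of a helper (the name `logs_parser_of` is made up for illustration) that pulls the parser label out of such a label list, so the Deployment and its pods can be compared:

```shell
# logs_parser_of LABELS: print the value of the choerodon.io/logs-parser label
# from a comma-separated label list (the LABELS column of --show-labels).
# Prints nothing when the label is absent.
logs_parser_of() {
  printf '%s\n' "$1" | tr ',' '\n' | sed -n 's|^choerodon\.io/logs-parser=||p'
}

# Example (not run here; needs access to the cluster):
#   kubectl get deploy -n c7n-system base-service --show-labels
#   kubectl get pods   -n c7n-system --show-labels
# then pass the LABELS column of each line to logs_parser_of.
```

An empty result for the Deployment line would mean the label only reached the pod template, not the Deployment metadata.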

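Has your problem been solved? I hit the same error, also on v0.21. If you solved it, please share the solution; I would be very grateful.

I get the same error too. I added what you described and the error is unchanged.

OK. I still don't know where the problem is; without the choerodon-logging source code I can't see how it handles this.

Please post your fluent-bit ConfigMap: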

 kubectl get cm -n logging fluent-bit-conf-choerodon-logging -o yaml

apiVersion: v1
data:
fluent-bit.conf: |+
#;this file is generate by flb api server
#;api:start
#;api:filter
[FILTER]
    Name                        kubernetes
    Match                       kube.*
#;api:end

#;api:start
#;api:output
[OUTPUT]
    Name                        es
    Match                       *
    Host                        ${ES_HOST}
    Port                        ${ES_PORT}
    Index                       es_index
    Type                        es
#;api:end

#;api:start
#;api:service
[SERVICE]
    Flush                       1
    Log_Level                   info
    Parsers_File                parsers.conf
#;api:end

parsers.conf: |
[PARSER]
    Name   apache
    Format regex
    Regex  ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
    Time_Key time
    Time_Format %d/%b/%Y:%H:%M:%S %z

[PARSER]
    Name   apache2
    Format regex
    Regex  ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
    Time_Key time
    Time_Format %d/%b/%Y:%H:%M:%S %z

[PARSER]
    Name   apache_error
    Format regex
    Regex  ^\[[^ ]* (?<time>[^\]]*)\] \[(?<level>[^\]]*)\](?: \[pid (?<pid>[^\]]*)\])?( \[client (?<client>[^\]]*)\])? (?<message>.*)$

[PARSER]
    Name   nginx
    Format regex
    Regex ^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
    Time_Key time
    Time_Format %d/%b/%Y:%H:%M:%S %z

[PARSER]
    Name   json-test
    Format json
    Time_Key time
    Time_Format %d/%b/%Y:%H:%M:%S %z

[PARSER]
    Name        docker
    Format      json
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L
    Time_Keep   On

[PARSER]
    Name        MicroServiceUI
    Format      json
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L
    Time_Keep   On

[PARSER]
    Name        Application
    Format      json
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L
    Time_Keep   On

[PARSER]
    Name        Mobile
    Format      json
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L
    Time_Keep   On

[PARSER]
    Name        spring-boot
    Format      regex
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L
    Decode_Field_as escaped_utf8 msg
    Regex       (?m-ix)^{"log":"(?<logtime>[^ ]* [^ ]*) {1,2}(?<level>[^ ]*)( \[(?<appname>[^,]*),(?<traceid>[^,]*),(?<spanid>[^,]*),(?<exportable>[^\]]*)\])? (?<processid>\d+) --- \[(?<thread>[^\]]*)\] (?<class>[^ ]*) *:(?<msg>.*)","stream".*time":"(?<time>[^"]*)\"}$

[PARSER]
    Name        java-spring
    Format      regex
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L
    Regex       (?m-ix)^{"log":"(?<logtime>[^ ]* [^ ]*) {1,2}(?<level>[^ ]*)( \[(?<appname>[^,]*),(?<traceid>[^,]*),(?<spanid>[^,]*),(?<exportable>[^\]]*)\])? (?<processid>\d+) --- \[(?<thread>[^\]]*)\] (?<class>[^ ]*) *:(?<msg>.*)","stream".*time":"(?<time>[^"]*)\"}$

[PARSER]
    Name        springnotrace
    Format      regex
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L
    Regex       (?m-ix)^{"log":"(?<logtime>[^ ]* [^ ]*)  (?<level>[^ ]*) (?<process_id>[\d]+) \-\-\- \[(?<span_id>[^\]]+)\](?<msg>.*)","stream".*time":"(?<time>[^"]*)\"}$

[PARSER]
    Name   ErrLines
    Format regex
    Decode_Field_as escaped_utf8 errs
    Regex  (?<errs>(?<=:\").*(?=\","stream))

[PARSER]
    Name        syslog
    Format      regex
    Regex       ^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$
    Time_Key    time
    Time_Format %b %d %H:%M:%S

kind: ConfigMap
metadata:
  creationTimestamp: "2020-04-01T18:36:21Z"
  name: fluent-bit-conf-choerodon-logging
  namespace: logging
  resourceVersion: "1574514"
  selfLink: /api/v1/namespaces/logging/configmaps/fluent-bit-conf-choerodon-logging
  uid: a72c26bc-d305-4a14-85f8-d2ecef468da6
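Both parser names used as label values in this thread, spring-boot and java-spring, appear as [PARSER] entries in the parsers.conf above. For other label values, a quick check like the following could rule out a typo in the parser name. This is a minimal sketch; `parser_defined` is a hypothetical helper, and it assumes the parsers.conf text has been saved to a local file.

```shell
# parser_defined NAME: read parsers.conf text on stdin and succeed only if
# some line reads "Name NAME" (as inside a [PARSER] section).
parser_defined() {
  awk -v want="$1" '$1 == "Name" && $2 == want { found = 1 } END { exit !found }'
}

# Example (not run here): with the parsers.conf content saved locally,
#   parser_defined spring-boot < parsers.conf && echo "parser exists"
```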

First of all, once everything is configured correctly, an [INPUT] section is added to the fluent-bit configuration:

    #;api:start
    #;api:c7ncd-staging:manager-service-abeac
    [INPUT]
        Name                        tail
        Path                        /var/log/containers/manager-service-abeac-*_c7ncd-staging_manager-service-abeac-*.log
        Mem_Buf_Limit               5MB
        Multiline                   On
        Parser_Firstline            java-spring
        Parser_N                    ErrLines
        Tag                         kube.manager-service-abeac.*
    #;api:end
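The fluent-bit pods abort with "No Input(s) have been defined", which is consistent with a ConfigMap that has no [INPUT] section at all. A minimal sketch for checking that after each change; `has_input_section` is a name chosen here for illustration, and the ConfigMap name and namespace are the ones used earlier in this thread.

```shell
# has_input_section: read a fluent-bit configuration (or the YAML dump of its
# ConfigMap) on stdin and succeed only if a line consisting of an [INPUT]
# header, optionally indented, is present.
has_input_section() {
  grep -Eq '^[[:space:]]*\[INPUT\]'
}

# Example (not run here; needs access to the cluster):
#   kubectl get cm -n logging fluent-bit-conf-choerodon-logging -o yaml \
#     | has_input_section && echo "INPUT present" || echo "no INPUT yet"
```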

Please confirm that the labels are configured correctly on the deployment:

[root@xxx ~]# kubectl get deploy -n c7ncd-staging manager-service-abeac -o yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
...
  labels:
    choerodon.io: 0.21.1
    choerodon.io/application: manager-service
    choerodon.io/logs-parser: java-spring
    choerodon.io/release: manager-service-abeac
    choerodon.io/version: 0.21.0

Do you mean the configuration of our own microservices?

kubectl get deploy -n c7n-system base-service -o yaml | more

kubectl get cm -n logging fluent-bit-conf-choerodon-logging -o yaml | more
apiVersion: v1
data:
fluent-bit.conf: |+
#;this file is generate by flb api server
#;api:start
#;api:filter
[FILTER]
    Name                        kubernetes
    Match                       kube.*
#;api:end

#;api:start
#;api:output
[OUTPUT]
    Name                        es
    Match                       *
    Host                        ${ES_HOST}
    Port                        ${ES_PORT}
    Index                       es_index
    Type                        es
#;api:end

#;api:start
#;api:service
[SERVICE]
    Flush                       1
    Log_Level                   info
    Parsers_File                parsers.conf
#;api:end

parsers.conf: |
[PARSER]
    Name   apache
    Format regex
    Regex  ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
    Time_Key time
    Time_Format %d/%b/%Y:%H:%M:%S %z
The [INPUT] section still does not appear.

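Did you solve this problem?

Is there any follow-up solution for this? The log labels are configured, but the problem persists.

What is the data-root of your Docker installation?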

cat /etc/docker/daemon.json
{
 ....
 "log-driver": "json-file",
 "log-level": "warn",
 "log-opts": {
   "max-size": "10m",
   "max-file": "3"
 },
 "data-root": "/var/lib/docker",
 "exec-opts": ["native.cgroupdriver=systemd"],
 "storage-driver": "overlay2",
 "storage-opts": [
   "overlay2.override_kernel_check=true"
 ]
}

If it is not /var/lib/docker, logs cannot be collected either. In that case, set fluent-bit.docker.data to your Docker data-root directory.
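To read the value without scanning the file by eye, something like the following could be used. It is a rough sketch: `docker_data_root` is a hypothetical helper, it does simple text matching rather than real JSON parsing, and it falls back to /var/lib/docker, Docker's default on Linux, when the key is absent.

```shell
# docker_data_root: read daemon.json text on stdin and print the configured
# "data-root"; print Docker's default /var/lib/docker when the key is unset.
docker_data_root() {
  root=$(sed -n 's/.*"data-root"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1)
  printf '%s\n' "${root:-/var/lib/docker}"
}

# Example (not run here; run on each node):
#   docker_data_root < /etc/docker/daemon.json
#   docker info --format '{{.DockerRootDir}}'   # what the daemon itself reports
```

If the printed path differs from /var/lib/docker, that path is the value to give fluent-bit.docker.data when installing the chart.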

{
  "registry-mirrors": ["https://dockerhub.azk8s.cn","https://quay.azk8s.cn"],
  "insecure-registries": ["10.244.0.0/18","10.244.64.0/18"],
  "max-concurrent-downloads": 10,
  "log-driver": "json-file",
  "log-level": "warn",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  },
  "data-root": "/var/lib/docker",
  "exec-opts": ["native.cgroupdriver=systemd"],
  "storage-driver": "overlay2",
  "storage-opts": [
    "overlay2.override_kernel_check=true"
  ]
}

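Oh, I forgot to mention: I don't have an ICP-registered domain name. The domain is one I set up for a server that I reach over a VPN. Could that be the cause? If so, is there a workaround?

I uninstalled all the pods and namespaces and reinstalled from scratch with the one-click deployment, and it still fails at the logging step. I find it strange: are the two of us the only ones hitting this problem on v0.21?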


@Vista @zhangxl3025 @zhengqianwu I installed version 0.21 and hit the same problem. Has it been solved? Could you point me toward a way to fix it?